Near-linear constructions of exact unitary 2-designs

Abstract

A unitary 2-design can be viewed as a quantum analogue of a 2-universal hash function: it is 
indistinguishable from a truly random unitary by any procedure that queries it twice. We show 
that exact unitary 2-designs on n qubits can be implemented by quantum circuits consisting of
$\tilde{O}(n)$ elementary gates in logarithmic depth. This is essentially a quadratic improvement in size
(and in width times depth) over all previous implementations that are exact or approximate 
(for sufficiently strong approximations). 


1 Introduction 

The uniform distribution on the group consisting of all unitary operations acting on n qubits
is captured by the Haar measure, which is the unique measure that is invariant under left and
right multiplication by any group element. Haar-random unitaries, by their symmetries, facilitate
many analyses in quantum information. However, Haar-random unitaries have
very high computational complexity, in that most of them cannot be efficiently implemented or 
reasonably well approximated by circuits of size polynomial in the number of qubits. They require 
many bits to describe. They also require a lot of randomness to sample. 

Unitary 2-designs are probability distributions on finite subsets of the unitary group that have 
some specific properties in common with the Haar measure. Several common definitions for unitary 
2-designs have been proposed and studied, each revolving around a specific property or
application and an appropriate notion of approximation. These 2-designs are computable by
polynomial-size circuits with short specifications and low sampling complexity. 

We focus on exact unitary 2-designs. In the exact case, we will see that several commonly used 
definitions can be shown to be equivalent to each other. One particularly natural definition is that 
they are two-query indistinguishable from Haar-random unitaries. Imagine a game where, at the 
flip of a coin, U is sampled either according to the Haar measure or with respect to the unitary 
2-design. A two-query distinguishing procedure can make two queries to U (each being either in the 
forward direction as $U$ or in the reverse direction as $U^\dagger$) as well as other quantum operations that
do not depend on U and then outputs a bit. A unitary 2-design has the property that no two-query 
distinguishing procedure can distinguish between the Haar-random case and the 2-design case with 
probability greater than 1/2. By this definition, a unitary 2-design is a quantum analogue of a 
2-universal hash function [5] (or, more precisely, 2-universal hash permutation). We will show in
Section 2 that this definition is equivalent to previous definitions, including those based on bilateral
twirling [12] and channel twirling.

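To make the classical side of this analogy concrete, the following minimal sketch checks that the affine maps x -> a*x + b over a small field form a 2-universal (pairwise independent) family of permutations. The field GF(2^3), the modulus x^3 + x + 1, the integer encoding of field elements, and all function names are illustrative assumptions and do not come from the constructions in this paper.

```python
from collections import Counter
from itertools import product

N = 3            # illustrative field GF(2^3)
POLY = 0b1011    # x^3 + x + 1, an assumed irreducible modulus

def gf_mul(a, b):
    """Multiply two elements of GF(2^N), encoded as integers below 2^N."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> N:       # degree reached N, so reduce by the modulus
            a ^= POLY
    return result

def hash_perm(a, b, x):
    """Affine permutation x -> a*x + b over GF(2^N), with a non-zero."""
    return gf_mul(a, x) ^ b

def output_pair_counts(x1, x2):
    """Count each output pair (h(x1), h(x2)) over all keys (a, b) with a != 0."""
    keys = product(range(1, 2 ** N), range(2 ** N))
    return Counter((hash_perm(a, b, x1), hash_perm(a, b, x2)) for a, b in keys)

# Two fixed distinct inputs; every pair of distinct outputs should be equally likely.
counts = output_pair_counts(1, 2)
```

Two classical queries to a random member of this family cannot separate it from a uniformly random permutation, which is the property that a unitary 2-design mirrors for two quantum queries.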
1.1 Previous work


The uniform distribution on the Clifford group has been shown to be an exact unitary 2-design in
the sense of bilateral twirling [12] and channel twirling. This implies that the circuit complexity
is $O(n^2/\log n)$ where the gates are one- and two-qubit gates from the Clifford group [1]. Moreover,
the sampling cost is $O(n^2)$ random bits of entropy. In the context of bilateral twirling, [19] shows
that a certain process of random circuit generation (introduced in [13]) yields $\epsilon_{\text{bilateral}}$-approximate
unitary 2-designs of size $O(n(n + \log 1/\epsilon_{\text{bilateral}}))$, where $\epsilon_{\text{bilateral}}$ measures the distance of the
resulting operation from the ideal one.

Another construction yields circuits of size $O(n \log 1/\epsilon_{\text{channel}})$ for a notion of approximation
that is natural for channel twirling; however, it has been pointed out that this notion of
approximation could (at least conceivably) incur a blow-up by a factor that is exponential in n in the
bilateral twirl context (see, e.g., Section 1.1 of [3], among others, for discussion about this).
For the more general setting, as far as we know, we might need $\epsilon_{\text{channel}} < \epsilon_{\text{bilateral}}/2^n$, so the
circuit size becomes $O(n(n + \log 1/\epsilon_{\text{bilateral}}))$.

For exact unitary 2-designs as well as approximations of them related to bilateral twirling, all
of the above constructions incur circuits of size $\tilde{\Omega}(n^2)$ and require $\tilde{\Omega}(n^2)$ random bits of entropy.

Reference [6] proves that there exists a small subgroup of the Clifford group that gives rise to
an exact unitary 2-design that uses approximately 5n random bits of entropy. However, the circuit
complexity for this construction is unknown, beyond the $O(n^2/\log n)$ bound that holds for any
Clifford operation. References [18] and [32] study the necessary and sufficient entropy for exact
and approximate unitary 2-designs. Approximately 4n random bits of entropy are necessary.

1.2 New results 

We give three constructions of exact unitary 2-designs on n qubits that have the following quantum 
gate costs (number of one- and two-qubit gates): 

• $O(n \log n \log\log n)$ gates (all Clifford gates) for infinitely many n, assuming the extended
Riemann Hypothesis is true.

• $O(n \log n \log\log n)$ gates (including non-Clifford gates) for all n, unconditionally.

• $O(n \log^2 n \log\log n)$ gates (all Clifford gates) for all n, unconditionally.

The circuits for the first two constructions can be organized so as to perform their computation in
$O(\log n)$ depth; the third in $O(\log^2 n)$ depth (using the fact that efficient multiplication/convolution
algorithms require only $O(\log n)$ depth [33]). These results are near optimal: in Appendix E we
show that for any unitary 2-design (exact or approximate under Definition 2 or 3), a high-probability
set of the unitaries have size $\Omega(n)$ and depth $\Omega(\log n)$.

All three constructions above use 5n bits of randomness (more precisely, they sample from
a uniform distribution on a set of size $2^{5n} - 2^{3n}$). They all consist of unitaries in the Clifford
group (even in the second construction, non-Clifford gates are used to compute Clifford unitaries
efficiently). The circuits use O(n) ancilla qubits (where each ancilla qubit is initially in state $|0\rangle$
and is restored to this state at the end of the computation). Finally, the cost of the classical process
that samples these unitary 2-designs (outputs a description of the quantum circuit) is polynomial
in n. The cost is dominated by the complexity of computing square roots in the finite field $\mathrm{GF}(2^n)$.

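The set size quoted above can be sanity-checked with a few lines of arithmetic. The sketch below assumes, based on Sections 3 and 4.2, that the sampled set pairs an element of SL_2(GF(2^n)), of which there are 2^{3n} - 2^n, with one of the 2^{2n} Paulis taken up to global phase; the function names are hypothetical.

```python
import math

def sl2_order(n: int) -> int:
    """Number of elements of SL_2(GF(2^n)), as quoted in Section 4.2."""
    return 2 ** (3 * n) - 2 ** n

def pauli_count(n: int) -> int:
    """Number of n-qubit Paulis X^a Z^b once global phases are disregarded."""
    return 2 ** (2 * n)

def ensemble_size(n: int) -> int:
    """Assumed size of the sampled set: one SL_2 element times one Pauli."""
    return sl2_order(n) * pauli_count(n)

def random_bits(n: int) -> float:
    """Entropy in bits of a uniform choice from the sampled set."""
    return math.log2(ensemble_size(n))

# The product matches the closed form 2^(5n) - 2^(3n) for small n.
for n in range(1, 20):
    assert ensemble_size(n) == 2 ** (5 * n) - 2 ** (3 * n)
```

Under this reading, the entropy is marginally below 5n bits, which is why the text says "more precisely" when quoting 5n.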
It should also be noted that our Definition 2 is a new characterization of unitary 2-designs in
terms of 2-query indistinguishability that may be of independent interest.

1.3 Significance of the new constructions 

Since our constructions yield exact unitary 2-designs, they are automatically valid for all notions and 
definitions of approximate unitary 2-designs. Our construction thus achieves the minimum known 
circuit size, depth, and sampling complexity simultaneously, among both exact and approximate 
unitary 2-designs. 

Exact 2-designs offer other advantages. Besides the original operational applications of bilateral 
and channel twirling, 2-designs have appeared in second moment analysis. For example, they arise in 
[21], where results are obtained about the decoupling of two quantum systems and quantum channel
capacities. An exact 2-design can be used in a “plug-and-play” manner. For example, there exists
an encoding operation in any unitary 2-design that, when concatenated with an appropriate inner
code, achieves the quantum channel capacity. Thus, our results automatically imply the existence
of such encoding circuits of $O(n \log^2 n \log\log n)$ Clifford gates and depth $O(\log^2 n)$.

In some applications such as decoupling, the distance from an exact 2-design is amplified by a
dimensional factor that can be exponential in n (for example, Theorem 1 in [35]). With our exact
construction, this error term vanishes, and with it the issue of exponential amplification of
errors. Thus our results yield potentially tighter bounds while maintaining a circuit size of $\tilde{O}(n)$.

Prior to our work, [3] constructs a method to generate random circuits of size $O(n \log^2 n)$ and
depth $O(\log^3 n)$ that does not give rise to a 2-design, yet achieves decoupling and provides small
encoding circuits for quantum error correcting codes [4]. The advantage in their approach is that 
no ancillas are needed, and the circuit may model some random physical processes. However, the 
depth is higher, and a substantial amount of analysis is required in the aforementioned references 
to show that the construction and circuit size indeed achieve the tasks with the desired accuracy. 
Adapting their construction to other applications may also require additional analysis. 


2 Definition of a unitary 2-design 

We first discuss several definitions that are equivalent to the concept of unitary 2-designs. 

Let $\mathcal{U}_N$ denote the group of $N \times N$ unitary matrices. We are interested in distributions over
$\mathcal{U}_N$. The Haar measure on $\mathcal{U}_N$ is the unique measure on $\mathcal{U}_N$ that is invariant under left and right
multiplication by any $U \in \mathcal{U}_N$. We denote the Haar measure by $\mu(U)$. Let $\mathcal{E} = \{p_i, U_i\}_{i=1}^{k}$ denote
a finite ensemble of unitary matrices $U_1, U_2, \ldots, U_k \in \mathcal{U}_N$ where $p_i > 0$ and $\sum_i p_i = 1$.

Sampling from the Haar measure is a powerful technique in quantum information theory.
Sometimes, we use a physical procedure that averages over such random choices of unitary transformation
(for example [12, 10]). At other times, we have a randomized argument, for example, in the proof
of quantum channel capacity (see, e.g., [29]), in which the average performance over all possible
unitary encodings is evaluated.

We are interested in contexts in which such sampling from the Haar measure can be replaced by
sampling from a finite ensemble $\mathcal{E} = \{p_i, U_i\}_{i=1}^{k}$ of unitary matrices. This can reduce the required
resources such as shared randomness, communication to implement the random unitary, as well
as the computational complexity of implementing the randomly chosen unitary. We now discuss
several of these circumstances.


The first context is concerned with the expected value of polynomials of the entries of unitary 
matrices drawn according to some distribution. This definition is essentially the original definition 
of unitary 2-design in [10], and is useful for proving results in other contexts.

Definition 1. We say that $\mathcal{E}$ is degree-2 expectation preserving if, for every polynomial $\Gamma(U)$ of
degree at most 2 in the matrix elements of U and at most 2 in the complex conjugates of those
matrix elements,

$$\sum_{i=1}^{k} p_i\, \Gamma(U_i) = \int d\mu(U)\, \Gamma(U). \tag{1}$$

In Eq. (1) and throughout the paper, an integral written without a specific domain is taken over $\mathcal{U}_N$.

The second context is concerned with distinguishing whether a random sample U is drawn from
the Haar measure or from the ensemble $\mathcal{E}$, when an arbitrary distinguishing circuit is allowed to
make a total of at most two queries of $U$ or $U^\dagger$. The most general circuit C of this form is depicted
in Figure 1.



Figure 1: Illustration of a 2-query distinguishing circuit C. The first query $Q_1$ can be $U$
or $U^\dagger$; likewise for the second query $Q_2$. The initial state $\rho$ is arbitrary, V is an arbitrary
unitary, and the final measurement outputs one bit but is otherwise arbitrary.


The circuit C starts with an arbitrary initial state $\rho$ (a positive semidefinite matrix of trace 1).
Then, the first query, an arbitrary operation V, the second query, and an arbitrary final
measurement that outputs one bit are applied in order. We call any such circuit a 2-query distinguishing
circuit. If U is drawn from either ensemble, denote the quantum state right before the
measurement as $\eta_2(C, U)$. If U is drawn from $\mathcal{E}$, the density matrix in C before the final measurement
is $\sum_i p_i\, \eta_2(C, U_i)$; similarly, if U is drawn from the Haar measure, the density matrix before the
final measurement is $\int d\mu(U)\, \eta_2(C, U)$. The output bit of the circuit C has the same distribution
regardless of which ensemble U is sampled from, if and only if the above two density matrices are
equal. The following definition describes ensembles that cannot be distinguished by any 2-query
distinguishing circuit C.

Definition 2. We say that $\mathcal{E}$ is 2-query indistinguishable if, for any distinguishing circuit C
making up to two queries of a random unitary or its adjoint,

$$\sum_{i=1}^{k} p_i\, \eta_2(C, U_i) = \int d\mu(U)\, \eta_2(C, U). \tag{2}$$


Figure 2: Illustration of the bilateral twirl: querying $U \otimes U$. The initial state $\rho$ is arbitrary.

The next context is a special case of the scenario depicted in Figure 1 where U is queried
twice in parallel, as illustrated in Figure 2. Consider bipartite operations in which two disjoint
systems undergo the same unitary transformation drawn according to some distribution. These
operations are sometimes called bilateral twirls. The $\mathcal{E}$ bilateral twirl is defined as the
quantum operation

$$\mathcal{T}_{\mathcal{E}}(\rho) = \sum_{i=1}^{k} p_i\, (U_i \otimes U_i)\, \rho\, (U_i^\dagger \otimes U_i^\dagger). \tag{3}$$

The full bilateral twirl is defined as the quantum operation

$$\mathcal{T}_{\mu}(\rho) = \int d\mu(U)\, (U \otimes U)\, \rho\, (U^\dagger \otimes U^\dagger). \tag{4}$$

The full bilateral twirl is motivated operationally and it appears in various mathematical
proofs in quantum information [21, 35]. Definition 3 describes ensembles that derandomize the full
bilateral twirl.

Definition 3. We say that the ensemble $\mathcal{E}$ implements the full bilateral twirl if $\mathcal{T}_{\mu}(\rho) = \mathcal{T}_{\mathcal{E}}(\rho)$ for
all $\rho$.

The fourth context is concerned with the task of converting any quantum channel into a
depolarizing channel of the same average fidelity. This conversion has many important applications, for
example, benchmarking (for estimating average channel fidelity) of quantum devices [10] and error
estimation (for detecting eavesdropping) in quantum key distribution [6].

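Let $\Lambda$ be any quantum channel that maps N-dimensional quantum states to N-dimensional
quantum states. An $\mathcal{E}$-channel-twirl of $\Lambda$, denoted by $\mathbb{E}_{\mathcal{E}}(\Lambda)$, is defined as the quantum channel
that acts as

$$\mathbb{E}_{\mathcal{E}}(\Lambda) : \rho \mapsto \sum_{i=1}^{k} p_i\, U_i^\dagger\, \Lambda(U_i \rho U_i^\dagger)\, U_i. \tag{5}$$

In other words, a random change of basis is applied to the system before the channel $\Lambda$ acts and it
is reverted afterwards. A full channel twirl of a quantum channel $\Lambda$ is given by

$$\mathbb{E}_{\mu}(\Lambda) : \rho \mapsto \int d\mu(U)\, U^\dagger\, \Lambda(U \rho U^\dagger)\, U. \tag{6}$$

Definition 4. We say that $\mathcal{E}$ implements the full channel twirl if $\mathbb{E}_{\mathcal{E}}(\Lambda) = \mathbb{E}_{\mu}(\Lambda)$ for all quantum
channels $\Lambda$.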

Lemma 1 below states that these four relationships between ensembles and the Haar measure are
equivalent. Thus, we can think of an ensemble satisfying one of the conditions in alternative ways.

Lemma 1. Let $\mathcal{E}$ be any ensemble of unitaries in $\mathcal{U}_N$. Then, the following are equivalent:

(1) $\mathcal{E}$ is degree-2 expectation preserving.

(2) $\mathcal{E}$ is 2-query indistinguishable.

(3) $\mathcal{E}$ implements the full bilateral twirl.

(4) $\mathcal{E}$ implements the full channel twirl.

The following corollary of Lemma 1 is not obvious from Definitions 3 and 4 alone.

Corollary 1. For $\mathcal{E} = \{p_i, U_i\}_{i=1}^{k}$, let $\mathcal{E}^\dagger := \{p_i, U_i^\dagger\}_{i=1}^{k}$.

(a) $\mathcal{E}$ implements the full bilateral twirl if and only if $\mathcal{E}^\dagger$ does.

(b) $\mathcal{E}$ implements the full channel twirl if and only if $\mathcal{E}^\dagger$ does.

We note that additional definitions have been discussed in the literature. Several parts of
Lemma 1 have also been proved previously; in particular, definitions (1), (3), and (4) have been
related with bounds on the approximations. We provide a complete (alternative) proof of Lemma 1
and Corollary 1 in Appendix A.

Due to Lemma 1, when we do not need to specify the context, we just call an ensemble satisfying
any one of the four conditions a “unitary 2-design.”


3 Pauli mixing implies a unitary 2-design 


We describe a simple sufficient condition for $\mathcal{E}$ to be a unitary 2-design.

We begin by reviewing some basic definitions and terminology associated with the Pauli group.
Let $X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$, and $Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ denote the $2 \times 2$ Pauli matrices. For any $a \in \{0,1\}^n$,
define $X^a = X^{a_1} \otimes \cdots \otimes X^{a_n}$ and $Z^a = Z^{a_1} \otimes \cdots \otimes Z^{a_n}$.

Definition 5. The Pauli group $\mathcal{P}_n$ consists of all operators of the form $i^k X^a Z^b$, where $k \in \{0,1,2,3\}$
and $a, b \in \{0,1\}^n$. Let $\mathcal{Q}_n = \mathcal{P}_n/\{\pm 1, \pm i\}$, the quotient group that results from
disregarding global phases in $\mathcal{P}_n$ (each element of $\mathcal{Q}_n$ can be represented as $P_{a,b} = X^a Z^b$). We call
$P_{0,0} = I$ the trivial Pauli.


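Let $H = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$ (the $2 \times 2$ Hadamard matrix), $S = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}$ (the phase gate), and

$$\mathrm{CNOT} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}. \tag{7}$$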


Definition 6. The Clifford group $\mathcal{C}_n$ is the set of all unitary matrices that permute the elements
of $\mathcal{P}_n$ (and thus $\mathcal{Q}_n$) under conjugation.

The Clifford group $\mathcal{C}_n$ contains the H, CNOT, and S gates, and they form a generating set.
Conjugating the elements in $\mathcal{P}_n$ by some $U \in \mathcal{C}_n$ gives a permutation on $\mathcal{P}_n$; this also induces a
permutation $\pi_U$ on $\mathcal{Q}_n$.

Definition 7. Consider an ensemble $\mathcal{E} = \{p_i, U_i\}_{i=1}^{k}$ of unitary matrices $U_1, U_2, \ldots, U_k$ in the
Clifford group $\mathcal{C}_n$. We say that $\mathcal{E}$ is Pauli mixing if, for all $P \in \mathcal{Q}_n$ such that $P \neq I$, the distribution
$\{p_i, \pi_{U_i}(P)\}$ is uniform over $\mathcal{Q}_n \setminus \{I\}$.

For any ensemble $\mathcal{E} = \{p_i, U_i\}_{i=1}^{k}$, let $\mathcal{E}_{\mathcal{Q}} = \{2^{-2n} p_i,\ U_i R_j\}_{i=1,\, j=1}^{k,\ 2^{2n}}$, where $R_j$ ranges over all
elements in $\mathcal{Q}_n$. Intuitively, $\mathcal{E}_{\mathcal{Q}}$ is the ensemble where a random element drawn from $\mathcal{E}$ is preceded
by a random Pauli operation drawn from $\mathcal{Q}_n$.

Pauli mixing by $\mathcal{E}$ is a sufficient condition for the ensemble $\mathcal{E}_{\mathcal{Q}}$ of Clifford unitaries to be a
unitary 2-design. More specifically, we have the following lemma.

Lemma 2. Let $\mathcal{E}$ be an ensemble of Clifford unitaries and $\mathcal{E}_{\mathcal{Q}}$ be defined as above. If $\mathcal{E}$ is Pauli
mixing, then $\mathcal{E}_{\mathcal{Q}}$ implements the full bilateral twirl.

The original proof of Lemma 2 can be found in [12]. A short proof based on representation theory
can be found in [18]. In the appendix we provide an elementary proof that may be of independent
interest. This proof uses some ideas from earlier proofs but has fewer assumptions. In particular, the new
proof does not rely on knowing how to evaluate (in closed form) the full bilateral twirl of an
arbitrary input state, nor on knowing the invariants of the full bilateral twirl. It is known how to
evaluate the full bilateral twirl using representation theory or the double commutant theorem. Our
new proof derives this result (how the full bilateral twirl acts) on the side.

Note that, in light of Lemma 1, an alternative way of proving that, whenever $\mathcal{E}$ is Pauli mixing,
$\mathcal{E}_{\mathcal{Q}}$ is a unitary 2-design is to use Definition 4. This can be shown in the following two steps. First,
conjugating any channel by a uniformly random Pauli operation drawn from $\mathcal{Q}_n$ yields a mixed-Pauli
channel (a channel that is a probability distribution on the Pauli operators); this is a known result.
Second, it is clear that, if $\mathcal{E}$ is Pauli mixing, then conjugating any mixed-Pauli channel by a
random element of $\mathcal{E}$ results in a depolarizing channel with the same average fidelity as the mixed-Pauli
channel. This corresponds exactly to an implementation of the full channel twirl.


4 Pauli mixing using the structure of $\mathrm{SL}_2(\mathrm{GF}(2^n))$

For the purposes of analyzing the Clifford group and its action on the Pauli group elements $X^a Z^b =
(X^{a_1} \otimes \cdots \otimes X^{a_n})(Z^{b_1} \otimes \cdots \otimes Z^{b_n})$, it is fruitful to associate a and b with elements of the Galois
field of size $2^n$. However, for this association to work well technically, we work with two different
representations of field elements. If a is represented in some primal basis then b is represented in
the dual of that basis. This section explains this basic framework.

4.1 Review of some properties of Galois fields $\mathrm{GF}(2^n)$

Let $\mathrm{GF}(2^n)$ denote the Galois field of size $2^n$ (more information about these fields can be found
in [28]). The elements of this field form a vector space over GF(2) so the notion of a basis is
well-defined: $\omega_1, \ldots, \omega_n \in \mathrm{GF}(2^n)$ are a basis if they are linearly independent and span the field; a basis
enables us to associate the elements of $\mathrm{GF}(2^n)$ with n-bit strings. A polynomial basis of $\mathrm{GF}(2^n)$ is
a basis that is of the form $1, \alpha, \alpha^2, \ldots, \alpha^{n-1}$, for some $\alpha \in \mathrm{GF}(2^n)$. The standard constructions of
$\mathrm{GF}(2^n)$ in terms of irreducible polynomials result in a polynomial basis. However, there are bases
that are not necessarily of this form, and these arise in our constructions. For instance, a normal
basis of $\mathrm{GF}(2^n)$ has the form $\alpha^{2^0}, \alpha^{2^1}, \ldots, \alpha^{2^{n-1}}$ for some $\alpha \in \mathrm{GF}(2^n)$.

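As a small illustration of the two kinds of bases, the sketch below searches GF(2^3) for elements whose conjugates under repeated squaring form a normal basis. The modulus x^3 + x + 1, the encoding of field elements as 3-bit integers in the polynomial basis, and the helper names are assumptions made only for this example.

```python
N = 3            # illustrative field GF(2^3)
POLY = 0b1011    # x^3 + x + 1, an assumed irreducible modulus

def gf_mul(a, b):
    """Multiply two elements of GF(2^N), encoded as integers below 2^N."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> N:       # degree reached N, so reduce by the modulus
            a ^= POLY
    return result

def conjugates(a):
    """Return [a^(2^0), a^(2^1), ..., a^(2^(N-1))]."""
    out = []
    for _ in range(N):
        out.append(a)
        a = gf_mul(a, a)
    return out

def is_basis(vectors):
    """True if the given elements span GF(2^N) as a vector space over GF(2)."""
    span = {0}
    for v in vectors:
        span |= {s ^ v for s in span}
    return len(span) == 2 ** N

# Elements alpha whose conjugates alpha, alpha^2, alpha^4 form a normal basis.
normal_generators = [a for a in range(1, 2 ** N) if is_basis(conjugates(a))]
```

With this modulus, the root of the modulus (encoded as 2) generates the polynomial basis 1, 2, 4 but not a normal basis, since its conjugates sum to zero.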
The dual of a basis is defined in terms of the trace function $T : \mathrm{GF}(2^n) \to \mathrm{GF}(2)$, which is
defined as $T(a) = a^{2^0} + a^{2^1} + \cdots + a^{2^{n-1}}$. The trace has the property that $T(a + b) = T(a) + T(b)$,
for all $a, b \in \mathrm{GF}(2^n)$. In terms of T, we can define the trace inner product of $a, b \in \mathrm{GF}(2^n)$ as
$T(ab)$. Now, for an arbitrary basis $\omega_1, \ldots, \omega_n \in \mathrm{GF}(2^n)$, that we refer to as the primal basis, we
can define its dual basis as the unique $\hat{\omega}_1, \ldots, \hat{\omega}_n \in \mathrm{GF}(2^n)$ such that

$$T(\omega_i \hat{\omega}_j) = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j. \end{cases} \tag{8}$$


We can associate the elements of $\mathrm{GF}(2^n)$ with n-bit binary strings by taking coordinates with
respect to a basis. To facilitate discussion, we use the following notation. With respect to any
primal basis $\omega_1, \ldots, \omega_n$ and its dual $\hat{\omega}_1, \ldots, \hat{\omega}_n$, for $a \in \mathrm{GF}(2^n)$:

• $\lceil a \rceil \in \{0,1\}^n$ denotes the coordinates of a in the primal basis. Thus, $a = \lceil a \rceil_1 \omega_1 + \cdots + \lceil a \rceil_n \omega_n$,
which is achieved by setting $\lceil a \rceil_j = T(a \hat{\omega}_j)$ for all $j \in \{1, \ldots, n\}$.

• $\lfloor a \rfloor \in \{0,1\}^n$ denotes the coordinates of a in the dual basis. Thus, $a = \lfloor a \rfloor_1 \hat{\omega}_1 + \cdots + \lfloor a \rfloor_n \hat{\omega}_n$,
which is achieved by setting $\lfloor a \rfloor_j = T(a \omega_j)$ for all $j \in \{1, \ldots, n\}$.

In some places, where the meaning is clear from the context, it is convenient to write a in place
of $\lceil a \rceil$. Also, it is sometimes convenient to think of n-bit binary strings as $\{0,1\}$-valued column
vectors of length n. Thus, $\lceil a \rceil$ and $\lfloor a \rfloor$ are sometimes interpreted as binary column vectors of
length n. Binary matrices acting on these vectors (in mod 2 arithmetic) are written with square
brackets.

It is straightforward to show that the conversion from primal to dual basis coordinates
corresponds to multiplication by the $n \times n$ binary matrix

$$W = \begin{bmatrix} T(\omega_1 \omega_1) & \cdots & T(\omega_1 \omega_n) \\ \vdots & & \vdots \\ T(\omega_n \omega_1) & \cdots & T(\omega_n \omega_n) \end{bmatrix}. \tag{9}$$

That is, $\lfloor a \rfloor = W \lceil a \rceil$ (with matrix-vector multiplication in mod 2 arithmetic). Also, $T(ab)$ is the
dot-product of the coordinates of a in the primal basis and the coordinates of b in the dual basis:

$$T(ab) = \lceil a \rceil \cdot \lfloor b \rfloor = \lceil a \rceil_1 \lfloor b \rfloor_1 + \cdots + \lceil a \rceil_n \lfloor b \rfloor_n \bmod 2. \tag{10}$$

The dual of the dual basis is the primal basis. A basis is self-dual if $\hat{\omega}_i = \omega_i$ for all i.
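The definitions around Eqs. (8) to (10) can be checked by brute force in a tiny field. The sketch below works in GF(2^3) with the polynomial basis 1, α, α^2 encoded as the integers 1, 2, 4; the modulus x^3 + x + 1 and every function name are assumptions chosen for illustration, and the dual basis is found by exhaustive search rather than by any method from the paper.

```python
N = 3            # illustrative field GF(2^3)
POLY = 0b1011    # x^3 + x + 1, an assumed irreducible modulus

def gf_mul(a, b):
    """Multiply two elements of GF(2^N), encoded as integers below 2^N."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= POLY
    return result

def trace(a):
    """T(a) = a + a^2 + a^4 + ... (N terms); the value is 0 or 1."""
    total, power = 0, a
    for _ in range(N):
        total ^= power
        power = gf_mul(power, power)
    return total

primal = [1, 2, 4]   # polynomial basis 1, alpha, alpha^2

def find_dual(basis):
    """Exhaustively find the dual basis satisfying Eq. (8)."""
    dual = []
    for j in range(N):
        for cand in range(1, 2 ** N):
            if all(trace(gf_mul(basis[i], cand)) == int(i == j) for i in range(N)):
                dual.append(cand)
                break
    return dual

dual = find_dual(primal)

def primal_coords(a):
    return [trace(gf_mul(a, w_hat)) for w_hat in dual]

def dual_coords(a):
    return [trace(gf_mul(a, w)) for w in primal]

# Primal-to-dual conversion matrix of Eq. (9).
W = [[trace(gf_mul(wi, wj)) for wj in primal] for wi in primal]

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) % 2 for row in M]

def eq9_and_eq10_hold():
    """Check the conversion by W and the dot-product form of T(ab) for all a, b."""
    for a in range(2 ** N):
        if dual_coords(a) != mat_vec(W, primal_coords(a)):
            return False
        for b in range(2 ** N):
            dot = sum(x * y for x, y in zip(primal_coords(a), dual_coords(b))) % 2
            if trace(gf_mul(a, b)) != dot:
                return False
    return True
```

In this example the dual basis is a permutation of the primal one but not equal to it element by element, so the polynomial basis here is not self-dual.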

Relative to any basis, multiplication by any particular $r \in \mathrm{GF}(2^n)$ is a linear operator in the
following sense. There exists a binary $n \times n$ matrix $M_r$ such that, for all $s \in \mathrm{GF}(2^n)$, $\lceil rs \rceil = M_r \lceil s \rceil$
(with mod 2 arithmetic for the matrix-vector multiplication). In fact, this matrix $M_r$ is

$$M_r = \begin{bmatrix} T(r \hat{\omega}_1 \omega_1) & \cdots & T(r \hat{\omega}_1 \omega_n) \\ \vdots & & \vdots \\ T(r \hat{\omega}_n \omega_1) & \cdots & T(r \hat{\omega}_n \omega_n) \end{bmatrix}, \tag{11}$$

and its transpose $(M_r)^T$ corresponds to multiplication by r in the dual basis (that is, $\lfloor rs \rfloor =
(M_r)^T \lfloor s \rfloor$). It should be noted that algorithms for multiplication in $\mathrm{GF}(2^n)$ are basis dependent;
the obvious cost of converting between two bases is $O(n^2)$.

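The role of $M_r$ and its transpose can be verified exhaustively for a small field. This sketch assumes GF(2^3) with modulus x^3 + x + 1, the polynomial basis encoded as 1, 2, 4, and its dual 1, 4, 2 (re-checked in the code); the entry convention, with the dual basis element indexing the row, is a reconstruction from the stated property of $M_r$, and the names are hypothetical.

```python
N = 3            # illustrative field GF(2^3)
POLY = 0b1011    # x^3 + x + 1, an assumed irreducible modulus

def gf_mul(a, b):
    """Multiply two elements of GF(2^N), encoded as integers below 2^N."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= POLY
    return result

def trace(a):
    """T(a) = a + a^2 + a^4 + ... (N terms); the value is 0 or 1."""
    total, power = 0, a
    for _ in range(N):
        total ^= power
        power = gf_mul(power, power)
    return total

PRIMAL = [1, 2, 4]   # polynomial basis 1, alpha, alpha^2
DUAL = [1, 4, 2]     # its dual for this modulus, verified just below
assert all(trace(gf_mul(PRIMAL[i], DUAL[j])) == int(i == j)
           for i in range(N) for j in range(N))

def primal_coords(a):
    return [trace(gf_mul(a, w_hat)) for w_hat in DUAL]

def dual_coords(a):
    return [trace(gf_mul(a, w)) for w in PRIMAL]

def mult_matrix(r):
    """Binary matrix with entry (i, j) equal to T(r * dual_i * primal_j)."""
    return [[trace(gf_mul(r, gf_mul(DUAL[i], PRIMAL[j]))) for j in range(N)]
            for i in range(N)]

def transpose(M):
    return [list(col) for col in zip(*M)]

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) % 2 for row in M]

def multiplication_is_linear():
    """M_r acts on primal coordinates and its transpose on dual coordinates."""
    for r in range(2 ** N):
        M = mult_matrix(r)
        for s in range(2 ** N):
            rs = gf_mul(r, s)
            if primal_coords(rs) != mat_vec(M, primal_coords(s)):
                return False
            if dual_coords(rs) != mat_vec(transpose(M), dual_coords(s)):
                return False
    return True
```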
4.2 Pauli mixing from a subgroup isomorphic to $\mathrm{SL}_2(\mathrm{GF}(2^n))$


Due to Lemma 2, it suffices to compute an ensemble of Clifford unitaries that is Pauli mixing.
Relative to a (primal) basis, we associate each pair $a, b \in \mathrm{GF}(2^n)$ with the Pauli group element
$X^{\lceil a \rceil} Z^{\lfloor b \rfloor} = (X^{\lceil a \rceil_1} \otimes \cdots \otimes X^{\lceil a \rceil_n})(Z^{\lfloor b \rfloor_1} \otimes \cdots \otimes Z^{\lfloor b \rfloor_n})$. Chau [6] showed that there is a subgroup
$G$ of the Clifford group of size $2^{O(n)}$ such that sampling uniformly over $G$ performs Pauli mixing.
We now give an overview of the approach in [6] (translated into our language). The subgroup $G$ is
isomorphic to the special linear group of $2 \times 2$ matrices over $\mathrm{GF}(2^n)$:

$$\mathrm{SL}_2(\mathrm{GF}(2^n)) = \left\{ \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} : \alpha, \beta, \gamma, \delta \in \mathrm{GF}(2^n) \text{ such that } \alpha\delta + \beta\gamma = 1 \right\}.$$

Note that $\mathrm{SL}_2(\mathrm{GF}(2^n))$ has $2^{3n} - 2^n$ elements. The subgroup $G$ induces a group action of $\mathrm{SL}_2(\mathrm{GF}(2^n))$
on the Paulis by conjugation by certain unitaries.

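Definition 8. With respect to a primal basis for $\mathrm{GF}(2^n)$, we say that a Clifford unitary U induces
$M \in \mathrm{SL}_2(\mathrm{GF}(2^n))$ if, for all $a, b \in \mathrm{GF}(2^n)$ and

$$\begin{pmatrix} a' \\ b' \end{pmatrix} = M \begin{pmatrix} a \\ b \end{pmatrix}, \tag{12}$$

$$U X^{\lceil a \rceil} Z^{\lfloor b \rfloor} U^\dagger \equiv X^{\lceil a' \rceil} Z^{\lfloor b' \rfloor}, \tag{13}$$

where $\equiv$ means equal up to a global phase in $\{1, i, -1, -i\}$ that is a function of M, a, and b.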


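To rephrase the above definition, suppose $M = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \in \mathrm{SL}_2(\mathrm{GF}(2^n))$. Then, for all a, b, conjugating
$(X^{\lceil a \rceil_1} \otimes \cdots \otimes X^{\lceil a \rceil_n})(Z^{\lfloor b \rfloor_1} \otimes \cdots \otimes Z^{\lfloor b \rfloor_n})$ by the Clifford unitary U yields
$(X^{\lceil \alpha a + \beta b \rceil_1} \otimes \cdots \otimes X^{\lceil \alpha a + \beta b \rceil_n})(Z^{\lfloor \gamma a + \delta b \rfloor_1} \otimes \cdots \otimes Z^{\lfloor \gamma a + \delta b \rfloor_n})$ up to a phase.

We adopt the following notational convention throughout the paper. We write matrices in
$\mathrm{SL}_2(\mathrm{GF}(2^n))$ and vectors of length 2 over $\mathrm{GF}(2^n)$ using parentheses (see above) to distinguish them from the
binary matrices and vectors described in the previous subsection, which use square brackets.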


It should be noted that, in [6], Eq. (13) is expressed using different notation for the Paulis, that
we call subscripted Paulis, defined as satisfying $X_a|c\rangle = |a + c\rangle$ and $Z_b|c\rangle = (-1)^{T(bc)}|c\rangle$. It is easy
to express these in terms of our superscripted Paulis, $X^{\lceil a \rceil}$ and $Z^{\lfloor b \rfloor}$, as $X_a = X^{\lceil a \rceil}$ and $Z_b = Z^{\lfloor b \rfloor}$
(since $T(bc) = \lfloor b \rfloor \cdot \lceil c \rceil$). The occurrence of the dual basis in $Z^{\lfloor b \rfloor}$ (which is equivalent to using $Z_b$)
in Eq. (13) is not merely a matter of convention: for general $M \in \mathrm{SL}_2(\mathrm{GF}(2^n))$ there does not exist
a unitary U that induces M in the sense that $U X^{\lceil a \rceil} Z^{\lceil b \rceil} U^\dagger \equiv X^{\lceil a' \rceil} Z^{\lceil b' \rceil}$. In terms of Definition 8,
the following holds.


Lemma 3 ([6]). With respect to any primal basis of $\mathrm{GF}(2^n)$ and every $M \in \mathrm{SL}_2(\mathrm{GF}(2^n))$, there
exists an n-qubit Clifford unitary U that induces M.


Definition 9. Consider $M \in \mathrm{SL}_2(\mathrm{GF}(2^n))$. Let $U_M$ denote a unitary that induces M with respect
to the primal basis; $U_M$ is unique up to multiplication by a Pauli (a proof of this is also
provided in the appendix). Similarly, let $\hat{U}_M$ denote a unitary that
induces M with respect to the dual basis.


The proof of Lemma 3 in [6] exhibits a possible choice of $U_M$ for any $M \in \mathrm{SL}_2(\mathrm{GF}(2^n))$.
However, it is unclear how to implement that $U_M$ as a small quantum circuit, except for the fact
that $U_M$ is in the Clifford group, so its gate complexity is $O(n^2/\log n)$ by [1]. Our results in
subsequent sections amount to an alternative proof of Lemma 3 for certain bases of $\mathrm{GF}(2^n)$, as well
as a modified version of this lemma. This enables us to ultimately attain gate constructions of size
$\tilde{O}(n)$ that implement unitary 2-designs. The relationship between Lemma 3 and unitary 2-designs
is based on the fact that the uniform ensemble over $\{U_M : M \in \mathrm{SL}_2(\mathrm{GF}(2^n))\}$ is Pauli mixing,
which is a consequence of the following.

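Lemma 4 ([6, 18]). Let $G_0$ denote the set of all non-zero elements of $\mathrm{GF}(2^n) \times \mathrm{GF}(2^n)$. Let
$M \in \mathrm{SL}_2(\mathrm{GF}(2^n))$ be chosen uniformly at random. Then, for any $\binom{a}{b} \in G_0$,

$$M \begin{pmatrix} a \\ b \end{pmatrix} \tag{14}$$

is uniformly distributed over $G_0$.

Proof. We first show that $\mathrm{SL}_2(\mathrm{GF}(2^n))$ acts transitively on $G_0$. Let $\binom{c}{d} \in G_0$. If $c \neq 0$, then
$\begin{pmatrix} c & 0 \\ d & c^{-1} \end{pmatrix} \binom{1}{0} = \binom{c}{d}$. If $c = 0$, then $d \neq 0$, so $\begin{pmatrix} 0 & d^{-1} \\ d & 0 \end{pmatrix} \binom{1}{0} = \binom{c}{d}$. Thus, we can map any $\binom{a}{b} \in G_0$
to $\binom{1}{0}$ and then to any other $\binom{c}{d} \in G_0$ using elements of $\mathrm{SL}_2(\mathrm{GF}(2^n))$.

To prove the lemma, suppose, by contradiction, that there are distinct $\binom{c}{d}$, $\binom{c'}{d'}$ such that
$\mathrm{Prob}_M\{M\binom{a}{b} = \binom{c}{d}\} = p_1$, $\mathrm{Prob}_M\{M\binom{a}{b} = \binom{c'}{d'}\} = p_2$ and $p_1 > p_2$. But there exists an
$M' \in \mathrm{SL}_2(\mathrm{GF}(2^n))$ such that $M'\binom{c}{d} = \binom{c'}{d'}$. So, $\mathrm{Prob}_M\{M'M\binom{a}{b} = \binom{c'}{d'}\} \geq p_1$. But the
distribution over M is the same as the distribution over $M'M$, so the left side of the last inequality
is $p_2$, which is a contradiction. □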
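Lemma 4 can be confirmed by exhaustive enumeration for a very small field. The sketch below uses GF(2^2) with the assumed modulus x^2 + x + 1, lists every matrix with αδ + βγ = 1, and counts where a fixed non-zero vector is sent; the encoding of field elements as integers and the function names are illustrative choices.

```python
from collections import Counter
from itertools import product

N = 2           # illustrative field GF(2^2)
POLY = 0b111    # x^2 + x + 1, an assumed irreducible modulus
Q = 2 ** N

def gf_mul(a, b):
    """Multiply two elements of GF(2^N), encoded as integers below 2^N."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= POLY
    return result

# All (alpha, beta, gamma, delta) with alpha*delta + beta*gamma = 1.
SL2 = [m for m in product(range(Q), repeat=4)
       if gf_mul(m[0], m[3]) ^ gf_mul(m[1], m[2]) == 1]

def apply(m, v):
    """Apply the matrix (alpha beta; gamma delta) to the column vector (a; b)."""
    alpha, beta, gamma, delta = m
    a, b = v
    return (gf_mul(alpha, a) ^ gf_mul(beta, b),
            gf_mul(gamma, a) ^ gf_mul(delta, b))

def image_counts(v):
    """How often each vector arises as M v when M ranges over SL_2."""
    return Counter(apply(m, v) for m in SL2)

def lemma4_holds():
    """Every non-zero start vector is spread evenly over all non-zero vectors."""
    nonzero = [v for v in product(range(Q), repeat=2) if v != (0, 0)]
    for v in nonzero:
        counts = image_counts(v)
        if set(counts) != set(nonzero) or len(set(counts.values())) != 1:
            return False
    return True
```

For n = 2 the group has 60 elements and there are 15 non-zero vectors, so each image is hit the same number of times, which matches the uniformity used in the proof.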

Our goal is to implement $U_M$ (for $M \in \mathrm{SL}_2(\mathrm{GF}(2^n))$) with quantum circuits consisting of $\tilde{O}(n)$
Clifford gates. The interplay between the primal basis and the dual basis is a major complicating
factor that we address using two different approaches. In one of our approaches we modify the
framework of $\mathrm{SL}_2(\mathrm{GF}(2^n))$.

Our approach in Section 5 is based on a self-dual basis for $\mathrm{GF}(2^n)$ and the structure of
$\mathrm{SL}_2(\mathrm{GF}(2^n))$. Our approach in Section 6 is based on a polynomial basis for $\mathrm{GF}(2^n)$ (and its
dual) and the structure of two subgroups of $\mathrm{SL}_2(\mathrm{GF}(2^n))$: the lower triangular subgroup and the
upper triangular subgroup. These are defined respectively as

$$\Delta_2(\mathrm{GF}(2^n)) = \left\{ \begin{pmatrix} \alpha & 0 \\ \beta & \alpha^{-1} \end{pmatrix} : \alpha, \beta \in \mathrm{GF}(2^n) \text{ and } \alpha \neq 0 \right\} \tag{15}$$

$$\nabla_2(\mathrm{GF}(2^n)) = \left\{ \begin{pmatrix} \alpha & \beta \\ 0 & \alpha^{-1} \end{pmatrix} : \alpha, \beta \in \mathrm{GF}(2^n) \text{ and } \alpha \neq 0 \right\}. \tag{16}$$

These subgroups have interesting mixing properties, albeit weaker ones than $\mathrm{SL}_2(\mathrm{GF}(2^n))$, which
are explained in Section 6.

4.3 A framework for implementing elements of $\mathrm{SL}_2(\mathrm{GF}(2^n))$ by unitaries

We first show that all elements of $\mathrm{SL}_2(\mathrm{GF}(2^n))$ can be written as a product of a small constant
number of matrices in a generating set, with more restrictive generating sets for $\Delta_2(\mathrm{GF}(2^n))$ and
$\nabla_2(\mathrm{GF}(2^n))$. Then we describe Clifford unitaries that induce these generating matrices. In
subsequent sections, we show how to implement these unitaries with $\tilde{O}(n)$ quantum gates, thereby
implementing elements of $\mathrm{SL}_2(\mathrm{GF}(2^n))$.

Lemma 5. Every element $M \in \mathrm{SL}_2(\mathrm{GF}(2^n))$ can be expressed as a product of a constant number
of the following elements of $\mathrm{SL}_2(\mathrm{GF}(2^n))$:

$$\begin{pmatrix} r & 0 \\ 0 & r^{-1} \end{pmatrix}, \quad \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}, \quad \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \tag{17}$$

where $r \in \mathrm{GF}(2^n)$ is non-zero.


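Proof. For any $M = \begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix} \in \mathrm{SL}_2(\mathrm{GF}(2^n))$, we can decompose it into a product as follows:

$$\begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix} = \begin{cases} \begin{pmatrix} 1 & 0 \\ \alpha^{-1}\beta & 1 \end{pmatrix} \begin{pmatrix} 1 & \alpha\gamma \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \alpha & 0 \\ 0 & \alpha^{-1} \end{pmatrix} & \text{if } \alpha \neq 0 \\[2ex] \begin{pmatrix} 1 & 0 \\ \gamma^{-1}\delta & 1 \end{pmatrix} \begin{pmatrix} \gamma & 0 \\ 0 & \gamma^{-1} \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} & \text{if } \alpha = 0. \end{cases} \tag{18}$$

Furthermore, for any non-zero $s \in \mathrm{GF}(2^n)$, there exists $t \in \mathrm{GF}(2^n)$ such that $t^2 = s$ (explicitly
$t = s^{2^{n-1}}$). This permits us to decompose further as

$$\begin{pmatrix} 1 & 0 \\ s & 1 \end{pmatrix} = \begin{pmatrix} t^{-1} & 0 \\ 0 & t \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} t & 0 \\ 0 & t^{-1} \end{pmatrix} \tag{19}$$

and

$$\begin{pmatrix} 1 & s \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ s & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \tag{20}$$

□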
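The factorizations in Eqs. (18) to (20) lend themselves to a brute-force check over a small field. The sketch below assumes GF(2^3) with modulus x^3 + x + 1, stores 2x2 matrices as nested tuples of rows, and uses hypothetical helper names; the lower-left entries of the first factors in Eq. (18) are the ones forced by the determinant condition.

```python
from itertools import product

N = 3            # illustrative field GF(2^3)
POLY = 0b1011    # x^3 + x + 1, an assumed irreducible modulus
Q = 2 ** N

def gf_mul(a, b):
    """Multiply two elements of GF(2^N), encoded as integers below 2^N."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= POLY
    return result

def gf_inv(a):
    """Inverse of a non-zero element, by exhaustive search (tiny fields only)."""
    return next(x for x in range(1, Q) if gf_mul(a, x) == 1)

def gf_sqrt(s):
    """Square root t = s^(2^(N-1)), obtained by squaring N-1 times."""
    t = s
    for _ in range(N - 1):
        t = gf_mul(t, t)
    return t

def mat_mul(A, B):
    """Product of 2x2 matrices over GF(2^N), stored as ((a, b), (c, d))."""
    return tuple(
        tuple(gf_mul(A[i][0], B[0][j]) ^ gf_mul(A[i][1], B[1][j]) for j in range(2))
        for i in range(2))

def product_of(factors):
    out = ((1, 0), (0, 1))
    for F in factors:
        out = mat_mul(out, F)
    return out

def diag(r):
    return ((r, 0), (0, gf_inv(r)))

def lower(s):
    return ((1, 0), (s, 1))

def upper(s):
    return ((1, s), (0, 1))

SWAP = ((0, 1), (1, 0))

def decompose(M):
    """Factors of Eq. (18) for M = ((alpha, gamma), (beta, delta))."""
    (alpha, gamma), (beta, delta) = M
    if alpha != 0:
        return [lower(gf_mul(gf_inv(alpha), beta)), upper(gf_mul(alpha, gamma)), diag(alpha)]
    return [lower(gf_mul(gf_inv(gamma), delta)), diag(gamma), SWAP]

def count_verified():
    """Number of matrices of determinant 1 for which Eq. (18) reproduces M."""
    count = 0
    for alpha, gamma, beta, delta in product(range(Q), repeat=4):
        if gf_mul(alpha, delta) ^ gf_mul(beta, gamma) != 1:
            continue
        M = ((alpha, gamma), (beta, delta))
        if product_of(decompose(M)) == M:
            count += 1
    return count

def eq19_eq20_hold():
    for s in range(1, Q):
        t = gf_sqrt(s)
        if gf_mul(t, t) != s:
            return False
        if product_of([diag(gf_inv(t)), lower(1), diag(t)]) != lower(s):
            return False
        if product_of([SWAP, lower(s), SWAP]) != upper(s):
            return False
    return True
```

Running `count_verified()` visits every matrix of determinant 1 in this field, so agreement on all of them supports the case split on whether α is zero.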


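It is easy to specialize the above lemma to the lower triangular and upper triangular matrices
in $\mathrm{SL}_2(\mathrm{GF}(2^n))$ as follows.

Lemma 6. Every element of $\Delta_2(\mathrm{GF}(2^n))$ can be expressed as a product of a constant number of
elements of the form

$$\begin{pmatrix} r & 0 \\ 0 & r^{-1} \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}, \tag{21}$$

and every element of $\nabla_2(\mathrm{GF}(2^n))$ can be expressed as a product of a constant number of elements of
the form

$$\begin{pmatrix} r & 0 \\ 0 & r^{-1} \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \tag{22}$$

where $r \in \mathrm{GF}(2^n)$ is non-zero.

In view of Lemma 5, for every M in $\mathrm{SL}_2(\mathrm{GF}(2^n))$, we can find a unitary that induces M if we
find a unitary that induces each of $\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$, $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, and $\begin{pmatrix} r & 0 \\ 0 & r^{-1} \end{pmatrix}$ for any non-zero $r \in \mathrm{GF}(2^n)$. Similar
statements hold for $\Delta_2(\mathrm{GF}(2^n))$ and $\nabla_2(\mathrm{GF}(2^n))$ with their respective generating sets shown in
Lemma 6.
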
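First consider any non-zero $r \in \mathrm{GF}(2^n)$, and the element of $\mathrm{SL}_2(\mathrm{GF}(2^n))$ of the form

$$\begin{pmatrix} r & 0 \\ 0 & r^{-1} \end{pmatrix}. \tag{23}$$

A Clifford unitary that induces $\begin{pmatrix} r & 0 \\ 0 & r^{-1} \end{pmatrix}$ is the multiply-by-r (in the primal basis) operation $\Pi_r$,
defined$^1$ as $\Pi_r |\lceil c \rceil\rangle = |\lceil rc \rceil\rangle$. To improve readability, we henceforth denote $|\lceil c \rceil\rangle$ by $|c\rangle$. For
example, in this notation, $\Pi_r |c\rangle = |rc\rangle$. Now, to check that $\Pi_r$ induces $\begin{pmatrix} r & 0 \\ 0 & r^{-1} \end{pmatrix}$, note that, for all
$c \in \mathrm{GF}(2^n)$,

$$\begin{aligned}
\Pi_r X^{\lceil a \rceil} \Pi_r^\dagger |c\rangle &= \Pi_r X^{\lceil a \rceil} |r^{-1}c\rangle && \text{(24)} \\
&= \Pi_r |r^{-1}c + a\rangle && \text{(25)} \\
&= |c + ra\rangle && \text{(26)} \\
&= X^{\lceil ra \rceil} |c\rangle. && \text{(27)}
\end{aligned}$$

Furthermore,

$$\begin{aligned}
\Pi_r Z^{\lfloor b \rfloor} \Pi_r^\dagger |c\rangle &= \Pi_r Z^{\lfloor b \rfloor} |r^{-1}c\rangle && \text{(28)} \\
&= \Pi_r (-1)^{\lfloor b \rfloor \cdot \lceil r^{-1}c \rceil} |r^{-1}c\rangle && \text{(29)} \\
&= (-1)^{T(b r^{-1} c)} |c\rangle && \text{(30)} \\
&= (-1)^{\lfloor r^{-1}b \rfloor \cdot \lceil c \rceil} |c\rangle && \text{(31)} \\
&= Z^{\lfloor r^{-1}b \rfloor} |c\rangle. && \text{(32)}
\end{aligned}$$

It follows that, for all $a, b \in \mathrm{GF}(2^n)$, $\Pi_r X^{\lceil a \rceil} Z^{\lfloor b \rfloor} \Pi_r^\dagger = X^{\lceil ra \rceil} Z^{\lfloor r^{-1}b \rfloor}$. In other words, $\Pi_r$ induces
$\begin{pmatrix} r & 0 \\ 0 & r^{-1} \end{pmatrix}$.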

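We can write any $\binom{a}{b} \in \mathrm{GF}(2^n) \times \mathrm{GF}(2^n)$ in a primal-dual basis as $\begin{bmatrix} \lceil a \rceil \\ \lfloor b \rfloor \end{bmatrix} \in \{0,1\}^{2n}$, where $\lceil a \rceil, \lfloor b \rfloor \in \{0,1\}^n$. Recall that, to distinguish objects over $\mathrm{GF}(2^n)$ (such as elements of $\mathrm{SL}_2(\mathrm{GF}(2^n))$) from their corresponding binary vectors and linear operators in a primal-dual basis, we use parentheses to denote the former and square brackets for the latter.

We summarize the effect of conjugating a Pauli $X^{\lceil a \rceil} Z^{\lfloor b \rfloor}$ by $\Pi_r$ on the binary strings $\lceil a \rceil$ and $\lfloor b \rfloor$ as the following mapping on $2n$-bit strings:

\[
\begin{bmatrix} \lceil a \rceil \\ \lfloor b \rfloor \end{bmatrix}
\mapsto
\begin{bmatrix} \lceil ra \rceil \\ \lfloor r^{-1}b \rfloor \end{bmatrix}
=
\begin{bmatrix} M_r \lceil a \rceil \\ (M_{r^{-1}})^T \lfloor b \rfloor \end{bmatrix}.
\tag{33}
\]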


Here $M_r$ is the linear operator corresponding to multiplication by $r$ in the primal basis, as defined in Eq. ©, and $(M_{r^{-1}})^T$, the transpose of $M_{r^{-1}}$, is (due to the form of Eq. ©) the linear operator corresponding to multiplication by $r^{-1}$ in the dual basis.

The following definition is similar to Definition 8.

Definition 10. We say that a Clifford unitary $U$ induces the $2n \times 2n$ binary matrix $M$ if, for all $n$-bit strings $s, r$,

\[
U X^{r} Z^{s} U^{\dagger} \equiv X^{r'} Z^{s'},
\quad\text{where}\quad
\begin{bmatrix} r' \\ s' \end{bmatrix} = M \begin{bmatrix} r \\ s \end{bmatrix}.
\tag{34}
\]

Here, $\equiv$ means equal up to a global phase in $\{1, i, -1, -i\}$ that is a function of $M$, $r$, and $s$. We also say that $U$ induces the mapping $v \mapsto Mv$.

For example, $\Pi_r$ induces the mapping given by Eq. (33) and the matrix

\[
\begin{bmatrix} M_r & 0 \\ 0 & (M_{r^{-1}})^T \end{bmatrix}.
\]

$^1$ This unitary operation acts on computational basis states similarly to $M_r$ defined in Sec. 4.1, Eq. (11).


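Now we return to finding unitaries that induce elements of $\mathrm{SL}_2(\mathrm{GF}(2^n))$. Consider the element $\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$ of $\mathrm{SL}_2(\mathrm{GF}(2^n))$. The Clifford unitary that induces $\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$ should transform the Pauli $X^{\lceil a \rceil} Z^{\lfloor b \rfloor}$ along the lines of the mapping

\[
\begin{bmatrix} \lceil a \rceil \\ \lfloor b \rfloor \end{bmatrix}
\mapsto
\begin{bmatrix} \lceil a \rceil \\ \lfloor a + b \rfloor \end{bmatrix}
=
\begin{bmatrix} \lceil a \rceil \\ \lfloor a \rfloor + \lfloor b \rfloor \end{bmatrix}
=
\begin{bmatrix} \lceil a \rceil \\ W \lceil a \rceil + \lfloor b \rfloor \end{bmatrix}
=
\begin{bmatrix} I & 0 \\ W & I \end{bmatrix}
\begin{bmatrix} \lceil a \rceil \\ \lfloor b \rfloor \end{bmatrix},
\tag{35}
\]

where $W$ is the linear operator for primal-to-dual basis conversion defined in Eq. (9).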
For any symmetric $n \times n$ binary matrix $V$, we show that there is a diagonal Clifford unitary $\Gamma_V$ that implements $\begin{bmatrix} I & 0 \\ V & I \end{bmatrix}$. The unitary $\Gamma_V$ is defined as

\[
\Gamma_V |c\rangle = i^{\sum_{j=1}^{n} \sum_{k=1}^{n} V_{jk} c_j c_k} |c\rangle.
\tag{36}
\]

We begin with some preliminary observations. Since, for all $j, k \in \{1, \ldots, n\}$, $V_{jk} = V_{kj}$, an equivalent definition is

\[
\Gamma_V |c\rangle = i^{\sum_{j=1}^{n} V_{jj} c_j} \, (-1)^{\sum_{1 \le j < k \le n} V_{jk} c_j c_k} |c\rangle.
\tag{37}
\]

From Eq. (37), it is clear that $\Gamma_V$ is in the Clifford group, since it is computed by the following composition of gates: an $S$ gate acting on each qubit $j$ for which $V_{jj} = 1$; and a controlled-$Z$ gate acting on qubits $j$ and $k$ for each $j < k$ where $V_{jk} = 1$ (all these gates commute). This generic construction consists of $O(n^2)$ gates. In Sections 5 and 6, for the primal-to-dual basis conversion matrix $W$ (which is symmetric from its definition in Eq. (9)), we exhibit circuits implementing $\Gamma_W$ with $\tilde{O}(n)$ gates.
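The equivalence of Eqs. (36) and (37), and the generic $O(n^2)$ gate list, can be illustrated with a short sketch. It works only with the exponent of $i$ on each basis state (no state vectors), and the function names and the tuple encoding of gates are hypothetical choices made for this example.

```python
import itertools
import random


def exponent_eq36(V, c):
    """Exponent of i in Eq. (36): sum over all j, k of V[j][k] c_j c_k, mod 4."""
    n = len(c)
    return sum(V[j][k] * c[j] * c[k] for j in range(n) for k in range(n)) % 4


def exponent_eq37(V, c):
    """Same exponent via Eq. (37): diagonal (S) part plus twice the j < k (CZ) part."""
    n = len(c)
    s_part = sum(V[j][j] * c[j] for j in range(n))
    cz_part = sum(V[j][k] * c[j] * c[k] for j in range(n) for k in range(j + 1, n))
    return (s_part + 2 * cz_part) % 4


def generic_gate_list(V):
    """Generic circuit for Gamma_V: S on qubit j if V[j][j] = 1, CZ on (j, k) if V[j][k] = 1."""
    n = len(V)
    gates = [("S", j) for j in range(n) if V[j][j]]
    gates += [("CZ", j, k) for j in range(n) for k in range(j + 1, n) if V[j][k]]
    return gates


def random_symmetric(n, rng):
    """Random symmetric n x n binary matrix (illustrative input only)."""
    V = [[0] * n for _ in range(n)]
    for j in range(n):
        for k in range(j, n):
            V[j][k] = V[k][j] = rng.randint(0, 1)
    return V


if __name__ == "__main__":
    rng = random.Random(0)
    V = random_symmetric(4, rng)
    for c in itertools.product([0, 1], repeat=4):
        assert exponent_eq36(V, c) == exponent_eq37(V, c)
    print(generic_gate_list(V))
```

The gate list has at most $n(n+1)/2$ entries, which is the quadratic cost that the later sections aim to avoid for $V = W$.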


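To check that $\Gamma_W$ induces the mapping in Eq. (35), it is convenient to separately consider the diagonal and off-diagonal entries of $W$. Let $W = D + E$, where $D$ is diagonal and $E_{jj} = 0$ for all $j \in \{1, \ldots, n\}$. This allows us to write $\Gamma_W = \Gamma_{D+E} = \Gamma_D \Gamma_E$, as a direct consequence of

\[
\begin{bmatrix} I & 0 \\ W & I \end{bmatrix}
=
\begin{bmatrix} I & 0 \\ D & I \end{bmatrix}
\begin{bmatrix} I & 0 \\ E & I \end{bmatrix}.
\tag{38}
\]

Then, from the discussion following Eq. (37), we know that $\Gamma_D = S^{W_{11}} \otimes \cdots \otimes S^{W_{nn}}$ and it is straightforward to check that

\[
\Gamma_D X^{\lceil a \rceil} \Gamma_D^{\dagger}
= i^{W_{11} a_1 + \cdots + W_{nn} a_n} X^{\lceil a \rceil} Z^{D \lceil a \rceil},
\tag{39}
\]

where we are using the notation $a_j = \lceil a \rceil_j$. For $\Gamma_E$, we have

\begin{align}
\Gamma_E X^{\lceil a \rceil} \Gamma_E^{\dagger} |c\rangle
&= \Gamma_E X^{\lceil a \rceil} (-1)^{\sum_{j=1}^{n} \sum_{k=j+1}^{n} W_{jk} c_j c_k} |c\rangle \tag{40}\\
&= \Gamma_E (-1)^{\sum_{j=1}^{n} \sum_{k=j+1}^{n} W_{jk} c_j c_k} |a + c\rangle \tag{41}\\
&= (-1)^{\sum_{j=1}^{n} \sum_{k=j+1}^{n} W_{jk} \left( (a_j + c_j)(a_k + c_k) + c_j c_k \right)} |a + c\rangle \tag{42}\\
&= (-1)^{\sum_{j=1}^{n} \sum_{k=j+1}^{n} W_{jk} (a_j a_k + a_j c_k + a_k c_j)} |a + c\rangle \tag{43}\\
&= (-1)^{\sum_{j=1}^{n} \sum_{k=j+1}^{n} W_{jk} a_j a_k} (-1)^{\lceil c \rceil \cdot E \lceil a \rceil} |a + c\rangle \tag{44}\\
&= (-1)^{\sum_{j=1}^{n} \sum_{k=j+1}^{n} W_{jk} a_j a_k} X^{\lceil a \rceil} (-1)^{\lceil c \rceil \cdot E \lceil a \rceil} |c\rangle \tag{45}\\
&= (-1)^{\sum_{j=1}^{n} \sum_{k=j+1}^{n} W_{jk} a_j a_k} X^{\lceil a \rceil} Z^{E \lceil a \rceil} |c\rangle. \tag{46}
\end{align}

Combining Eqs. (39), (46), and the fact that $\Gamma_W$ commutes with every $Z^{\lfloor b \rfloor}$, we have

\begin{align}
\Gamma_W X^{\lceil a \rceil} Z^{\lfloor b \rfloor} \Gamma_W^{\dagger}
&= i^{\sum_{j=1}^{n} \sum_{k=1}^{n} W_{jk} a_j a_k} X^{\lceil a \rceil} Z^{D \lceil a \rceil + E \lceil a \rceil + \lfloor b \rfloor} \tag{47}\\
&= i^{\sum_{j=1}^{n} \sum_{k=1}^{n} W_{jk} a_j a_k} X^{\lceil a \rceil} Z^{\lfloor a \rfloor + \lfloor b \rfloor}, \tag{48}
\end{align}

which implies that $\Gamma_W$ induces the mapping in Eq. (35).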
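The conjugation identity derived above can be checked numerically on basis states without building any matrices. Both $\Gamma_V X^{a} \Gamma_V^{\dagger}$ and $i^{a^T V a} X^{a} Z^{Va}$ send $|c\rangle$ to a phase times $|c \oplus a\rangle$, so it suffices to compare the two exponents of $i$ modulo 4. The sketch below does this for a random symmetric binary $V$ standing in for $W$; all names are hypothetical.

```python
import itertools
import random


def quad(V, c):
    """Integer value of sum_{j,k} V[j][k] c_j c_k (the exponent of i in Eq. (36))."""
    n = len(c)
    return sum(V[j][k] * c[j] * c[k] for j in range(n) for k in range(n))


def lhs_exponent(V, a, c):
    """Exponent of i acquired when Gamma_V X^a Gamma_V^dagger acts on |c>."""
    shifted = [x ^ y for x, y in zip(a, c)]
    return (quad(V, shifted) - quad(V, c)) % 4


def rhs_exponent(V, a, c):
    """Exponent of i acquired when i^{a^T V a} X^a Z^{Va} acts on |c>."""
    n = len(a)
    Va = [sum(V[j][k] * a[k] for k in range(n)) % 2 for j in range(n)]
    z_sign = sum(x * y for x, y in zip(Va, c)) % 2   # Z^{Va}|c> = (-1)^{Va.c}|c>
    return (quad(V, a) + 2 * z_sign) % 4


def check_conjugation(n, seed=0):
    """Compare both sides for all a, c in {0,1}^n and one random symmetric V."""
    rng = random.Random(seed)
    V = [[0] * n for _ in range(n)]
    for j in range(n):
        for k in range(j, n):
            V[j][k] = V[k][j] = rng.randint(0, 1)
    bits = list(itertools.product([0, 1], repeat=n))
    return all(lhs_exponent(V, a, c) == rhs_exponent(V, a, c) for a in bits for c in bits)


if __name__ == "__main__":
    print(check_conjugation(4))
```

The check relies on the symmetry of $V$, which is exactly the property of $W$ used in the derivation.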


For completeness, a unitary operation that induces the element $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ of $\mathrm{SL}_2(\mathrm{GF}(2^n))$ should also be considered. This is addressed in Sections 5 and 6 in very different ways, and we defer the discussion of this to those sections.

5 $\tilde{O}(n)$ implementation based on self-dual basis for $\mathrm{GF}(2^n)$

We want to find $\tilde{O}(n)$-sized circuits to implement unitaries that induce the generators of $\mathrm{SL}_2(\mathrm{GF}(2^n))$. Our first approach is to represent $\mathrm{GF}(2^n)$ in a self-dual basis. The advantage of using a self-dual basis is that the change of basis operation $W$ defined in Eq. (9) is simply $I$. Since there is no distinction between coordinates in the primal and the dual bases, we omit the $\lceil\ \rceil$ and $\lfloor\ \rfloor$ notations in this section. For all $n$-bit strings $a, b$, $S^{\otimes n} X^{a} Z^{b} (S^{\dagger})^{\otimes n} = i^{a_1 + \cdots + a_n \bmod 4} X^{a} Z^{a+b}$ and $H^{\otimes n} X^{a} Z^{b} H^{\otimes n} = (-1)^{a \cdot b} X^{b} Z^{a}$. Therefore, $S^{\otimes n}$ and $H^{\otimes n}$ respectively induce $\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$ and $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$.

The challenge of using a self-dual basis lies in the implementation of the unitary $\Pi_r$ (field multiplication) that induces $\begin{pmatrix} r & 0 \\ 0 & r^{-1} \end{pmatrix}$. Fast multiplication methods with respect to a polynomial basis are known; however, no polynomial basis of $\mathrm{GF}(2^n)$ is also self-dual if $n > 2$ [23]. Our solution is to use special self-dual bases that can be efficiently converted to and from polynomial bases. These special self-dual bases are constructed with Gauss periods, and are known for admissible $n$'s (see Definition 11 below). According to [37], there are infinitely many admissible $n$'s under the extended Riemann Hypothesis. Our implementation in this section is restricted to these values of $n$:

Definition 11. A natural number $n$ is called admissible if the following two conditions hold:

(1) $2n + 1$ is prime.

(2) $\gcd(e, n) = 1$, where $e$ is the index of the subgroup generated by 2 in $\mathbb{Z}^*_{2n+1}$.

In the above, $\mathbb{Z}^*_{2n+1}$ denotes the multiplicative group of $\mathbb{Z}_{2n+1}$. Since $\mathbb{Z}^*_{2n+1}$ has $2n$ elements,

In the remainder of this section, we first describe the procedure of finding a self-dual basis using Gauss periods, and briefly explain the efficient conversion between these two representations. Then we describe the implementation of $\Pi_r$.

Since, for admissible values of $n$, $2n + 1$ is prime, Fermat's Little Theorem implies $2^{2n} = 1 \bmod 2n + 1$. So $2n + 1$ divides $2^{2n} - 1$, which implies that there is a primitive $(2n + 1)$-th root of unity $\beta \in \mathrm{GF}(2^{2n})$. One way to get $\beta$ is the following. Let $\xi$ be a generator of the multiplicative group of $\mathrm{GF}(2^{2n})$. Because $\xi^{2^{2n} - 1} = 1$, we can take $\beta = \xi^{(2^{2n} - 1)/(2n + 1)}$. Consider the set

\[
S = \{\beta + \beta^{-1},\ \beta^2 + \beta^{-2},\ \ldots,\ \beta^n + \beta^{-n}\}.
\tag{49}
\]

We first show that $S$ is a self-dual normal basis of $\mathrm{GF}(2^n)$ over $\mathrm{GF}(2)$ (as defined in Section 4.1). Then we show how to efficiently convert between $S$ and a polynomial basis.

First we show that for an admissible $n$, 2 and $-1$ generate $\mathbb{Z}^*_{2n+1}$ (i.e., $\langle 2, -1 \rangle = \mathbb{Z}^*_{2n+1}$). A proof is given in ns], and it can be rephrased as follows. Let $\gamma$ generate the cyclic group $\mathbb{Z}^*_{2n+1}$. If $e$ is the index of $\langle 2 \rangle$ in $\mathbb{Z}^*_{2n+1}$, then $2 = \gamma^e$. Furthermore, $\gamma^n = -1$. Since $\gcd(e, n) = 1$, there are integers $k_1, k_2$ such that $1 = e k_1 + n k_2$ and therefore, $\gamma \in \langle 2, -1 \rangle$, so $\mathbb{Z}^*_{2n+1} = \langle 2, -1 \rangle$.
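The admissibility condition of Definition 11 and the claim that 2 and $-1$ generate $\mathbb{Z}^*_{2n+1}$ are easy to test for small $n$. The following sketch is a brute-force illustration (trial-division primality, direct computation of the order of 2); the function names are hypothetical, and it is not meant as an efficient test for large $n$.

```python
from math import gcd


def is_prime(m):
    """Trial-division primality test; adequate for the small values used here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def order_of_two(p):
    """Multiplicative order of 2 modulo an odd prime p."""
    order, value = 1, 2 % p
    while value != 1:
        value = (value * 2) % p
        order += 1
    return order


def is_admissible(n):
    """Definition 11: 2n + 1 is prime and gcd(e, n) = 1, with e the index of <2>."""
    p = 2 * n + 1
    if not is_prime(p):
        return False
    index = (2 * n) // order_of_two(p)   # |Z*_p| = 2n, so index = 2n / ord(2)
    return gcd(index, n) == 1


def generated_by_two_and_minus_one(n):
    """Check that {+-2^j : 0 <= j < n} covers all of {1, ..., 2n} modulo 2n + 1."""
    p = 2 * n + 1
    powers = {pow(2, j, p) for j in range(n)}
    return powers | {(-x) % p for x in powers} == set(range(1, p))


if __name__ == "__main__":
    admissible = [n for n in range(1, 30) if is_admissible(n)]
    print(admissible)
    print(all(generated_by_two_and_minus_one(n) for n in admissible))
```

For example, $n = 8$ fails condition (2): 17 is prime, but 2 has order 8 modulo 17, so $e = 2$ shares a factor with $n$.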

Our next step, showing that $S$ is a self-dual basis, follows from [38]. Since $\mathbb{Z}^*_{2n+1} = \langle 2, -1 \rangle$, it follows that

\[
\{2^0, -2^0, 2^1, -2^1, \ldots, 2^{n-1}, -2^{n-1}\} = \{1, -1, 2, -2, \ldots, n, -n\} \bmod 2n + 1,
\tag{50}
\]

and we can reorder the elements of $S$ as

\[
\{\beta^{2^0} + \beta^{-2^0},\ \beta^{2^1} + \beta^{-2^1},\ \ldots,\ \beta^{2^{n-1}} + \beta^{-2^{n-1}}\}.
\tag{51}
\]

The set in Eq. (51) as a subset of $\mathrm{GF}(2^n)$ is equal to $\{\alpha^{2^0}, \alpha^{2^1}, \ldots, \alpha^{2^{n-1}}\}$, where $\alpha = \beta + \beta^{-1}$ is called a Gauss period of type $(n, 2)$ over $\mathrm{GF}(2)$. It is easy to see that $\beta + \beta^{-1} \in \mathrm{GF}(2^n)$, for one can verify that $(\beta + \beta^{-1})^{2^n} = \beta + \beta^{-1}$.

Finally, we need to show that $S$ is a basis. We invoke Theorem 3.1 in m, which implies that $\alpha$ is a normal element in $\mathrm{GF}(2^n)$ (generating a normal basis as defined in Section 4.1). Then, from Corollary 3.5 in m, any normal basis of Gauss period of type $(n, 2)$ over $\mathrm{GF}(2)$ is self-dual when $n > 2$, so $S$ is self-dual, as claimed.


Next, we show how to efficiently convert between $S$ and a polynomial basis. We define a mapping from $\mathrm{GF}(2^n)$ to $\{0,1\}^{n+1}$ as follows. If $a \in \mathrm{GF}(2^n)$, then $a' = [0, a_1, \ldots, a_n]^T$, where $a = a_1(\beta + \beta^{-1}) + \cdots + a_n(\beta^n + \beta^{-n})$. In other words, $a'$ is the coordinate of $a$ with respect to the spanning set $\{1, \beta + \beta^{-1}, \beta^2 + \beta^{-2}, \ldots, \beta^n + \beta^{-n}\}$. Including the element 1 makes this spanning set not a basis, but significantly simplifies the conversion between the following two spanning sets:

\begin{align}
S' &= \{1,\ \beta + \beta^{-1},\ \beta^2 + \beta^{-2},\ \ldots,\ \beta^n + \beta^{-n}\}, \tag{52}\\
T &= \{1,\ \beta + \beta^{-1},\ (\beta + \beta^{-1})^2,\ \ldots,\ (\beta + \beta^{-1})^n\}. \tag{53}
\end{align}

Notice that the set $T$ arises from adding 1 to a polynomial basis. We call $S'$ a self-dual spanning set and $T$ a polynomial spanning set. The fact that $T$ is not a basis does not affect how we represent a field element as a polynomial based on $T$, i.e., $a = \sum_{i=0}^{n} \tilde{a}_i (\beta + \beta^{-1})^i$, and fast multiplication of two polynomials of this form still works.

Let $s_i = \beta^i + \beta^{-i}$, $t_i = (\beta + \beta^{-1})^i$, and let $s_i'$ and $t_i'$ be the $(n+1)$-bit strings output by the mapping defined earlier. We now describe the linear transformation $L_{n+1}$ that maps $s_i'$ to $t_i'$ for all $i$ (by right multiplication). The transformation $L_{n+1}$ is not unique. A simple choice for $L_{n+1}$ is based on the binomial expansion $(\beta + \beta^{-1})^j = \sum_{i=0}^{j} \binom{j}{i} \beta^{j - 2i}$. More precisely, for general $k$, we can choose $L_k$ as

\[
(L_k)_{i,j} =
\begin{cases}
0 & \text{if } i > j \text{ or } j - i \text{ is odd,}\\
\binom{j}{(j-i)/2} \bmod 2 & \text{otherwise,}
\end{cases}
\tag{54}
\]

where $0 \le i, j < k$. The operation $L_k$ can be reversed: $L_k$ is upper-triangular with 1's on the diagonal, which implies $\det(L_k) = 1$, so $L_k$ is invertible.
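To make Eq. (54) concrete, the sketch below builds $L_k$ from binomial coefficients and checks that column $j$ really lists the coordinates of $(x + x^{-1})^j$ in the spanning set $\{1, x + x^{-1}, x^2 + x^{-2}, \ldots\}$ over $\mathrm{GF}(2)$. A formal variable $x$ is used in place of $\beta$, and Laurent polynomials are represented as sets of exponents with odd coefficient; these representation choices and the function names are assumptions made for illustration.

```python
from math import comb


def build_L(k):
    """Matrix of Eq. (54): entry (i, j) is C(j, (j - i)/2) mod 2 if i <= j and j - i is even."""
    return [
        [comb(j, (j - i) // 2) % 2 if i <= j and (j - i) % 2 == 0 else 0 for j in range(k)]
        for i in range(k)
    ]


def laurent_power_exponents(j):
    """Exponents with odd coefficient in (x + 1/x)^j, computed by repeated multiplication."""
    exps = {0}
    for _ in range(j):
        new = set()
        for e in exps:
            new ^= {e + 1}   # multiply the monomial x^e by x
            new ^= {e - 1}   # multiply the monomial x^e by 1/x
        exps = new
    return exps


def column_as_exponents(L, j):
    """Expand column j of L in terms of 1 (row 0) and x^i + x^(-i) (row i > 0)."""
    exps = set()
    for i in range(len(L)):
        if L[i][j]:
            exps ^= {0} if i == 0 else {i, -i}
    return exps


if __name__ == "__main__":
    L = build_L(8)
    for row in L:
        print(row)
    print(all(column_as_exponents(L, j) == laurent_power_exponents(j) for j in range(8)))
```

Since the symmetric difference of sets models addition of coefficients modulo 2, the comparison is exactly the binomial expansion quoted above, reduced mod 2.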

Finally, we will find a unitary $\mathcal{L}_k$ that induces $L_k$. (More precisely, we are inducing the matrix with identical diagonal blocks that is the $(k-1) \times (k-1)$ submatrix of $L_k$ with the first row and column omitted.) The unitary $\mathcal{L}_n$ also induces a conversion from $S'$ to $T$. In [38], the following theorem is proved.

Theorem 2 ([55]). Right multiplying $L_{n+1}$ ($L_{n+1}^{-1}$ respectively) by the vector representation $a'$ of an element $a \in \mathrm{GF}(2^n)$ described above can be done using $O(n \log n)$ operations (additions and multiplications) in $\mathrm{GF}(2)$.

From this theorem, an efficient (classical) circuit for $L_{n+1}$ can be built with $O(n \log n)$ CNOT gates. The intuition is that $L_{n+1}$ can be decomposed as a product of $O(\log n)$ matrices, each with $O(n)$ 1's. Since the linear transformation can be done with $\mathrm{GF}(2)$ additions and multiplications, it can be implemented with CNOT gates. A circuit for $L_{n+1}^{-1}$ can be obtained by running the circuit for $L_{n+1}$ backwards.


Here we prove Theorem 2 with a different approach: a recursive construction that also requires $O(n \log n)$ CNOT gates. First consider $L_k$ as defined in Eq. (54) where $k = 2^t$ is a power of 2. Taking $k = 8$ as an example,

\[
L_8 =
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 1 & 0 & 1 & 0 & 0 & 0 & 1\\
0 & 0 & 1 & 0 & 0 & 0 & 1 & 0\\
0 & 0 & 0 & 1 & 0 & 1 & 0 & 1\\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 1\\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0\\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}.
\tag{55}
\]

We use two properties of $L_k$ when $k = 2^t$ (see $L_8$ above for an illustration):

(1) Each $L_k$ consists of three non-zero blocks: two identical diagonal blocks, each equal to $L_{k/2}$, and a block above the diagonal which we call $\Gamma_{k/2}$ (which is almost like $L_{k/2}$ turned upside down).$^2$

(2) The first row of $\Gamma_{k/2}$ contains only zeros. The $(i+2)$-th row of $\Gamma_{k/2}$ is the $(\frac{k}{2} - i)$-th row of $L_{k/2}$ (where $0 \le i \le k/2 - 2$).


We first explain why these two properties hold, as illustrated in Figure 3. Take the Pascal's triangle (mod 2) with $k$ rows, and rotate the entries 90 degrees counter-clockwise. This gives the (nontrivial) $(i, j)$ entries of $L_k$ when $i \le j$ and $j - i$ is even. The stated properties for $L_k$ primarily come from the fact that Pascal's triangle (mod 2) with $k$ rows consists of 4 triangles of $k/2$ rows: the middle one only has zero entries, and the other three are identical copies of Pascal's triangle (mod 2) with $k/2$ rows. Also, the triangle is always left-right symmetric. Proofs of these are readily obtained from Lucas' Theorem$^3$ [13] (a more accessible proof can be found online at [31]).

$^2$ Note here we use $\Gamma$ for something different from the previous section.

Figure 3: An illustration of the Pascal's triangle structure of the $L_8$ matrix. Taking the left half of an 8-level Pascal's triangle and rotating counter-clockwise by 90 degrees, we obtain the $L_8$ matrix. Note that the block $\Gamma_4$ is the horizontal reflection of the lower diagonal block $L_4$ with a downward shift, as described by property (2).


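If we multiply $L_k$ to a vector,

\[
\begin{bmatrix} L_{k/2} & \Gamma_{k/2} \\ 0 & L_{k/2} \end{bmatrix}
\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}
=
\begin{bmatrix} L_{k/2} v_1 + \Gamma_{k/2} v_2 \\ L_{k/2} v_2 \end{bmatrix}.
\tag{56}
\]

Due to the relation between $\Gamma_{k/2}$ and $L_{k/2}$, the above mapping can be induced by the unitary $\mathcal{L}_k$ implemented by the circuit in Figure 4. Using standard recursion analysis, the circuit contains $O(k \log k)$ CNOT gates.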
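The recursion behind Eq. (56) can be sketched classically as follows. The function `apply_L` multiplies $L_k$ by a bit vector using only the two block properties, and is compared against a dense reference built from Eq. (54). The names, the in-place XOR bookkeeping, and the count `xor_count` (used here as a stand-in for the CNOT count of one possible circuit layout) are assumptions of this sketch rather than the circuit of Figure 4.

```python
import random
from math import comb


def build_L(k):
    """Dense reference matrix from Eq. (54)."""
    return [
        [comb(j, (j - i) // 2) % 2 if i <= j and (j - i) % 2 == 0 else 0 for j in range(k)]
        for i in range(k)
    ]


def dense_apply(L, v):
    """Plain matrix-vector product over GF(2)."""
    return [sum(L[i][j] & v[j] for j in range(len(v))) % 2 for i in range(len(v))]


def apply_L(v):
    """Multiply L_k by v (k = len(v), a power of 2) via the block recursion of Eq. (56)."""
    k = len(v)
    if k == 1:
        return list(v)
    half = k // 2
    top = apply_L(v[:half])       # L_{k/2} v_1
    bottom = apply_L(v[half:])    # L_{k/2} v_2
    # Gamma_{k/2} v_2: entry 0 is zero, entry m equals entry (half - m) of L_{k/2} v_2.
    for m in range(1, half):
        top[m] ^= bottom[half - m]
    return top + bottom


def xor_count(k):
    """Number of XOR updates performed by apply_L on a length-k input."""
    return 0 if k == 1 else 2 * xor_count(k // 2) + (k // 2 - 1)


if __name__ == "__main__":
    rng = random.Random(0)
    for k in (2, 4, 8, 16, 32):
        v = [rng.randint(0, 1) for _ in range(k)]
        assert apply_L(v) == dense_apply(build_L(k), v)
    print([xor_count(2 ** t) for t in range(1, 8)])
```

The recurrence for `xor_count` is $G(k) = 2G(k/2) + k/2 - 1$, which solves to $O(k \log k)$, in line with the count stated above.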



Figure 4: An example of representation conversion circuit which demonstrates the recursive 
structure. 


For general values of $k$, let $t = \lceil \log_2 k \rceil$ and apply the above construction to obtain $\mathcal{L}_{2^t}$. We restrict the circuit for $\mathcal{L}_{2^t}$ to a sub-circuit with the first $k$ registers and the CNOT gates between them to obtain a circuit for $\mathcal{L}_k$ that still has size $O(k \log k)$.

A circuit for $\mathcal{L}_k^{-1}$ converting a vector from the self-dual representation to the polynomial representation can be obtained by running the circuit for $\mathcal{L}_k$ backwards. The first qubit, which corresponds to the additional "1" in $S'$, is always $|0\rangle$ and it remains untouched during the computation. Therefore, the first qubit can be safely removed in the circuit. It is kept in the analysis for conceptual simplicity.

$^3$ Consider the base-$p$ representation of integers $m$ and $n$, where $m \ge n \ge 0$, and $p$ is prime: $m = m_0 + m_1 p + \cdots + m_k p^k$, $n = n_0 + n_1 p + \cdots + n_k p^k$. Then, $\binom{m}{n} \equiv \binom{m_0}{n_0} \binom{m_1}{n_1} \cdots \binom{m_k}{n_k} \bmod p$.

Finally, we are ready to give the recipe for the fast multiplication of two elements $a, r \in \mathrm{GF}(2^n)$ represented in the basis $S$:

1. Insert a zero at the beginning of the vector representations of $a$ and $r$ to get the vectors $a'$ and $r'$ with respect to the spanning set $S'$.

2. Convert $a'$ and $r'$ to new representations $\tilde{a}$ and $\tilde{r}$ with respect to the polynomial spanning set $T$, using the circuit for $\mathcal{L}_{n+1}^{-1}$.

3. Multiply $\tilde{a}$ by $\tilde{r}$ using Schönhage's multiplication algorithm [33] (denoted by $\tilde{\Pi}_r$ in Figure 5). The result is a vector with respect to the polynomial spanning set $\{1, \beta + \beta^{-1}, (\beta + \beta^{-1})^2, \ldots, (\beta + \beta^{-1})^{2n}\}$.

4. Apply the unitary $\mathcal{L}_{2n+1}$ to the vector above so it is represented in the spanning set $\{1, \beta + \beta^{-1}, \beta^2 + \beta^{-2}, \ldots, \beta^{2n} + \beta^{-2n}\}$. Then, discard the first element, which is always 0. The result is the vector representation with respect to the spanning set $\{\beta + \beta^{-1}, \beta^2 + \beta^{-2}, \ldots, \beta^{2n} + \beta^{-2n}\}$. Since $\beta$ is a $(2n+1)$-th root of unity in $\mathrm{GF}(2^{2n})$ (i.e., $\beta^{2n+1} = 1$), we have $\beta + \beta^{-1} = \beta^{2n} + \beta^{-2n}$, $\beta^2 + \beta^{-2} = \beta^{2n-1} + \beta^{-2n+1}$, and so on. Therefore, with $n$ additional CNOTs (additions in $\mathrm{GF}(2)$), the resulting vector can be reduced to the one with respect to the permuted self-dual normal basis $S$.

In Step 3, Schönhage's multiplication algorithm [33] uses a radix-3 FFT algorithm to do fast convolution. Readers not familiar with German may refer to [36] for another description of Schönhage's algorithm. This multiplication algorithm requires $O(n \log n \log\log n)$ operations (additions and multiplications). Additions can be implemented with CNOT gates. Multiplications involved in this radix-3 FFT are the ones between an element of the polynomial ring $\mathrm{GF}(2)[x]/(x^{2m} + x^m + 1)$ (for certain $m$) and $x$ (which is a $3m$-th root of unity in $\mathrm{GF}(2)[x]/(x^{2m} + x^m + 1)$). The result of this kind of multiplication is a shift of coefficients and it can be implemented by SWAP gates. Therefore, the whole multiplication method can be implemented with $O(n \log n \log\log n)$ CNOT gates. As an example, Figure 5 shows the implementation of $\Pi_r$ in $\mathrm{GF}(2^5)$.

It is easy to show that the radix-3 FFT algorithm has logarithmic depth: if the current step of this algorithm is working on a polynomial of degree $k$, in the next recursion step, it will work in parallel on three polynomials of degree $\lceil k/3 \rceil$. The total number of steps (i.e., the depth of the circuit) is therefore $O(\log n)$ for a polynomial of degree $n$. To multiply two polynomials of degree at most $n$, each recursion step essentially consists of three components: computing the radix-3 FFT, recursively doing $\lceil \sqrt{n} \rceil$ multiplications of polynomials of degree at most $\lceil \sqrt{n} \rceil$ (in parallel), and computing the inverse radix-3 FFT. Using a similar analysis, the depth of the polynomial multiplication circuit is $O(\log(n) + \log(n^{1/2}) + \log(n^{1/4}) + \cdots + 1) = O(\log n)$. The logarithmic depth of the basis conversion circuit can be shown by its recursive structure (e.g., Figure 4). Therefore, the depth of the circuit for $\Pi_r$ is $O(\log n)$.

The ancillary qubits can be reset to $|0\rangle$ using standard techniques in reversible computing. The result is a circuit for $\Pi_r$ for any non-zero $r \in \mathrm{GF}(2^n)$ with $O(n \log n \log\log n)$ CNOT gates.

Figure 5: The implementation of $\Pi_r$ for multiplication of $a$ by $r$ where $a, r \in \mathrm{GF}(2^5)$. $\tilde{\Pi}_r$ is an implementation of Schönhage's multiplication algorithm. The input and output bits are with respect to a self-dual basis.


6 $\tilde{O}(n)$ implementations based on polynomial basis for $\mathrm{GF}(2^n)$

In this section we present alternative circuit constructions for unitary 2-designs in terms of polynomial bases for $\mathrm{GF}(2^n)$. The advantage of using polynomial bases is that the $\mathrm{SL}_2(\mathrm{GF}(2^n))$ generator $\begin{pmatrix} r & 0 \\ 0 & r^{-1} \end{pmatrix}$ for $r \ne 0$ is straightforward to implement$^4$ with $O(n \log n \log\log n)$ Clifford gates with depth $O(\log n)$, as described at the end of Section 5.

For the generator $\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$, we provide two different $\tilde{O}(n)$ circuit implementations in Subsections 6.1 and 6.2. However, we do not currently know how to implement the last generator $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ using only $\tilde{O}(n)$ gates. To circumvent this problem, we modify our ensemble for the unitary 2-design slightly. Instead of implementing every element of $\mathrm{SL}_2(\mathrm{GF}(2^n))$, we implement the elements that are lower triangular (i.e., $\Delta_2(\mathrm{GF}(2^n))$), and we do this using $\tilde{O}(n)$ gates. This follows directly from combining the implementations for $\begin{pmatrix} r & 0 \\ 0 & r^{-1} \end{pmatrix}$ and $\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$ and by using Lemma 6. We can also implement all $M \in \nabla_2(\mathrm{GF}(2^n))$ with respect to the dual basis (we denote this unitary $\tilde{U}_M$), because with respect to the dual basis, the operations that induce $\begin{pmatrix} r & 0 \\ 0 & r^{-1} \end{pmatrix}$ and $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ are $H^{\otimes n} \Pi_r H^{\otimes n}$ and $H^{\otimes n} \Gamma_W H^{\otimes n}$ (respectively). In Subsection 6.3 we show how to combine the implementations of $\Delta_2(\mathrm{GF}(2^n))$ in the primal basis and $\nabla_2(\mathrm{GF}(2^n))$ in the dual basis to achieve Pauli mixing. This results in an exact unitary 2-design with the desired complexity.

6.1 Implementation of $\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$ with $O(n \log n \log\log n)$ non-Clifford gates

Here we provide an implementation of $\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$ using $O(n \log n \log\log n)$ gates that can be organized so as to have depth $O(\log n)$. This construction uses non-Clifford gates but they compose to a Clifford unitary. (The next subsection contains a slightly less efficient construction using only Clifford gates.)

$^4$ In this section, "implement a mapping" abridges "implement the unitary that induces a mapping according to Definitions 8 or 10" and so on.


The operation that we need to implement is $\Gamma_W$, defined in Eqs. (36) and (37) (with $V$ set to $W$). Recall that $W$ is the primal-to-dual basis conversion matrix of Eq. (9). Since we are setting the primal basis to a polynomial basis, $W$ is a Hankel matrix: for all $j, k, j', k'$, if $j + k = j' + k'$ then $W_{jk} = W_{j'k'}$. We make use of this property in this and the next subsection. From Eq. (36),

\[
\Gamma_W |c\rangle = i^{\sum_{j=1}^{n} \sum_{k=1}^{n} W_{jk} c_j c_k} |c\rangle.
\tag{57}
\]

Note that it suffices to compute the exponent of $i$ using mod 4 arithmetic, and the exponent has the form

\[
\begin{bmatrix} c_1 & \cdots & c_n \end{bmatrix}
W
\begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix}.
\tag{58}
\]

This problem is related to the problem of computing convolutions. Recall that the convolution of two $d$-dimensional vectors $u$ and $v$ is defined as the $(2d - 1)$-dimensional vector $w$ such that

\begin{align}
& w_0 + w_1 T + w_2 T^2 + \cdots + w_{2d-2} T^{2d-2} \tag{59}\\
&\quad = \left(u_0 + u_1 T + u_2 T^2 + \cdots + u_{d-1} T^{d-1}\right) \left(v_0 + v_1 T + v_2 T^2 + \cdots + v_{d-1} T^{d-1}\right) \tag{60}
\end{align}

as polynomials over $T$. The product of a Hankel matrix with a vector reduces to convolution, as shown in the next proposition.

Proposition 3. The product of an $n \times n$ Hankel matrix with an $n$-dimensional vector reduces to the problem of computing the convolution of two $(2n - 1)$-dimensional vectors.


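Proof. This can be seen by comparing

\[
\begin{bmatrix}
x_1 & x_2 & \cdots & x_n \\
x_2 & x_3 & \cdots & x_{n+1} \\
\vdots & \vdots & & \vdots \\
x_n & x_{n+1} & \cdots & x_{2n-1}
\end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
\tag{61}
\]

with the middle components of the convolution of $[x_1, \ldots, x_{2n-1}]$ and $[0, \ldots, 0, y_n, \ldots, y_1]$. The convolution is a $(4n - 3)$-dimensional vector that is the vector in Eq. (61) padded with $2n - 2$ components on the left and $n - 1$ components on the right.
□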
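Proposition 3 can be illustrated directly. The sketch below forms the two $(2n-1)$-dimensional vectors from the proof, convolves them, and reads the Hankel product off the middle $n$ components. A schoolbook $O(n^2)$ convolution is used purely for clarity; in the construction it would be replaced by a fast polynomial multiplication. Indices are 0-based here and the function names are hypothetical.

```python
def convolve(u, v):
    """Schoolbook convolution of two integer sequences (polynomial product)."""
    w = [0] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            w[i + j] += ui * vj
    return w


def hankel_times_vector(x, y, modulus):
    """Product H y for the Hankel matrix H[r][k] = x[r + k], via one convolution.

    x has length 2n - 1 and y has length n; entries are reduced modulo `modulus`.
    """
    n = len(y)
    padded = [0] * (n - 1) + list(reversed(y))   # [0, ..., 0, y_n, ..., y_1]
    w = convolve(x, padded)                      # length 4n - 3
    return [value % modulus for value in w[2 * n - 2: 3 * n - 2]]


def hankel_times_vector_direct(x, y, modulus):
    """Reference: row-by-row evaluation of the same product."""
    n = len(y)
    return [sum(x[r + k] * y[k] for k in range(n)) % modulus for r in range(n)]


if __name__ == "__main__":
    x = [1, 0, 1, 1, 0, 1, 1]     # defines a 4 x 4 Hankel matrix
    y = [1, 1, 0, 1]
    print(hankel_times_vector(x, y, 4), hankel_times_vector_direct(x, y, 4))
```

The slice boundaries match the padding counts in the proof: $2n - 2$ components are skipped on the left and $n - 1$ on the right.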


Returning to the computation of Eq. (58), we can compute $e_1, \ldots, e_n \in \mathbb{Z}_4$, given by

\[
\begin{bmatrix} e_1 \\ \vdots \\ e_n \end{bmatrix}
= W
\begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix},
\tag{62}
\]

with a fast algorithm for polynomial multiplication$^5$ over the ring $\mathbb{Z}_4$ using only $O(n \log n \log\log n)$ gates (see, for example, Theorem 8.23 in [36]). Then Eq. (58) for the exponent of $i$ in $\Gamma_W$ can be obtained from the $2n$ ancillary qubits containing $e_1, \ldots, e_n$ (each $e_j$ is a two-bit string) and the $n$ qubits containing $c_1, \ldots, c_n$ as follows. For each $j \in \{1, \ldots, n\}$, apply a controlled-$Z$ gate between the high order bit of $e_j$ and $c_j$ and apply a controlled-$S$ gate between the low order bit of $e_j$ and $c_j$.

$^5$ We conjecture that it is possible to slightly reduce the gate count for this construction from $O(n \log n \log\log n)$ to $(n \log n)\, 2^{O(\log^* n)}$ by employing the improved algorithms for integer multiplication initiated by Fürer [15].

This construction explicitly uses the non-Clifford controlled-$S$ gates, since the underlying ring is $\mathbb{Z}_4$ and addition mod 4 requires non-Clifford gates. The construction uses polynomial multiplications, so it follows from the circuit depth analysis in Section 5 that the circuit depth of this construction is $O(\log n)$. In the next subsection, we describe a different procedure for implementing $\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$ that is slightly less efficient, but uses only Clifford gates.
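The phase bookkeeping of the mod-4 construction in this subsection can be mimicked classically. In the sketch below, $e = Wc$ is computed over $\mathbb{Z}_4$ (naively here, standing in for the fast polynomial multiplication), and each $e_j$ is split into a high and a low bit: the controlled-$Z$ on the high bit contributes $i^2$ and the controlled-$S$ on the low bit contributes $i$. The function names are hypothetical, and any symmetric binary matrix is accepted in place of $W$.

```python
import itertools


def exponent_direct(W, c):
    """Exponent of i in Eq. (58): c^T W c, reduced mod 4."""
    n = len(c)
    return sum(W[j][k] * c[j] * c[k] for j in range(n) for k in range(n)) % 4


def exponent_via_ancillas(W, c):
    """Same exponent, accumulated gate by gate from the two-bit registers e_j."""
    n = len(c)
    e = [sum(W[j][k] * c[k] for k in range(n)) % 4 for j in range(n)]   # Eq. (62)
    exponent = 0
    for j in range(n):
        high, low = e[j] >> 1, e[j] & 1
        exponent += 2 * (high & c[j])   # controlled-Z: phase (-1) = i^2
        exponent += low & c[j]          # controlled-S: phase i
    return exponent % 4


if __name__ == "__main__":
    x = [1, 0, 1, 1, 0, 1, 1]                                   # Hankel generator
    W = [[x[j + k] for k in range(4)] for j in range(4)]
    print(all(exponent_direct(W, c) == exponent_via_ancillas(W, c)
              for c in itertools.product([0, 1], repeat=4)))
```

The decomposition works because $i^{e_j c_j} = (-1)^{h_j c_j}\, i^{l_j c_j}$ when $e_j = 2h_j + l_j$.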


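6.2 Implementation of $\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$ with $O(n \log^2 n \log\log n)$ Clifford gates

Here we provide an implementation of $\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$ using $O(n \log^2 n \log\log n)$ Clifford gates that can be organized so as to have depth $O(\log^2 n)$. In the previous subsection, the computation is reduced to a convolution in mod 4 arithmetic, and we needed non-Clifford gates to compute this efficiently. Here, we use a recursive procedure that is based on convolutions in mod 2 arithmetic, which can be performed efficiently with Clifford gates. We assume all notation from the previous subsection.

To simplify our presentation, we assume that $n$ is a power of 2 (though our approach can be generalized to arbitrary $n$ by dividing unevenly in the recursive step, as $n = \lfloor \frac{n}{2} \rfloor + \lceil \frac{n}{2} \rceil$). We divide $W$ into four $\frac{n}{2} \times \frac{n}{2}$ blocks as

\[
W = \begin{bmatrix} W^{(11)} & W^{(12)} \\ W^{(21)} & W^{(22)} \end{bmatrix},
\tag{63}
\]

where $W^{(11)}, W^{(12)}, W^{(21)}, W^{(22)}$ are $\frac{n}{2} \times \frac{n}{2}$ Hankel matrices and $W^{(12)} = W^{(21)}$. Define

\[
A = \begin{bmatrix} 0 & W^{(12)} \\ W^{(21)} & 0 \end{bmatrix},
\quad
B = \begin{bmatrix} W^{(11)} & 0 \\ 0 & 0 \end{bmatrix},
\quad
C = \begin{bmatrix} 0 & 0 \\ 0 & W^{(22)} \end{bmatrix}.
\tag{64}
\]

Clearly,

\[
\begin{bmatrix} I & 0 \\ W & I \end{bmatrix}
=
\begin{bmatrix} I & 0 \\ A & I \end{bmatrix}
\begin{bmatrix} I & 0 \\ B & I \end{bmatrix}
\begin{bmatrix} I & 0 \\ C & I \end{bmatrix},
\tag{65}
\]

so we can implement $\Gamma_A$, $\Gamma_B$, and $\Gamma_C$ separately, and compose them to obtain $\Gamma_W$.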


We first show how to implement $\Gamma_A$ using $O(n \log n \log\log n)$ gates. From Eq. (37),

\[
\Gamma_A |c\rangle = (-1)^{\sum_{j=1}^{n/2} \sum_{k=n/2+1}^{n} W_{jk} c_j c_k} |c\rangle.
\tag{66}
\]

The expression for the exponent of $-1$ above can be computed in mod 2 arithmetic, and has the form

\[
\begin{bmatrix} c_1 & \cdots & c_{n/2} \end{bmatrix}
W^{(12)}
\begin{bmatrix} c_{n/2+1} \\ \vdots \\ c_n \end{bmatrix}.
\tag{67}
\]

Once again, by Proposition 3, the above product of a Hankel matrix with a vector reduces to convolution, and hence polynomial multiplication over the field $\mathrm{GF}(2)$. We can compute the bits $e_{n/2+1}, \ldots, e_n$, defined as

\[
\begin{bmatrix} e_{n/2+1} \\ \vdots \\ e_n \end{bmatrix}
= W^{(12)}
\begin{bmatrix} c_{n/2+1} \\ \vdots \\ c_n \end{bmatrix},
\tag{68}
\]

in $\frac{n}{2}$ ancillary registers using only $O(n \log n \log\log n)$ gates. Moreover, since the convolution is with respect to entries of $W$ (which are constants in our setting), all the gates can be Clifford gates (in fact, CNOT gates). Then we can apply $O(n)$ controlled-$Z$ gates between the bits $e_{n/2+1}, \ldots, e_n$ and $c_1, \ldots, c_{n/2}$ (respectively) to apply the phase that correctly implements $\Gamma_A$.

What remains is to compute $\Gamma_B$ and $\Gamma_C$. Each of these is equivalent to computing an instance of the original problem of size $n/2$. In the bottom of the recurrence (when $W$ is a $1 \times 1$ matrix), a single $S$ (phase) gate computes $\Gamma_W$. The gate cost $G(n)$ of the recursive procedure satisfies the recurrence

\[
G(n) = 2G(n/2) + O(n \log n \log\log n),
\tag{69}
\]

whose solution satisfies

\[
G(n) \in O(n \log^2 n \log\log n).
\tag{70}
\]

This recursive construction needs polynomial multiplication in each recursion step. According to the circuit depth analysis for polynomial multiplication in Section 5, the circuit depth is $O(\log n + \log \frac{n}{2} + \cdots + 1) = O(\log^2 n)$.
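The recursive splitting into $\Gamma_A$, $\Gamma_B$, and $\Gamma_C$ can be sketched classically by tracking the exponent of $i$. In the code below the cross term (the $\Gamma_A$ part) uses only mod 2 arithmetic, the two diagonal blocks recurse, and a $1 \times 1$ block contributes a single $S$-gate phase. The matrix-vector product is done naively in place of the fast $\mathrm{GF}(2)$ convolution, and all names are hypothetical.

```python
import itertools
import random


def exponent_direct(W, c):
    """Reference: c^T W c mod 4, the exponent of i in Gamma_W|c>."""
    n = len(c)
    return sum(W[j][k] * c[j] * c[k] for j in range(n) for k in range(n)) % 4


def exponent_recursive(W, c):
    """Exponent of i in Gamma_W|c>, using mod-2 inner products plus recursion."""
    n = len(c)
    if n == 1:
        return (W[0][0] * c[0]) % 4                 # a single S gate
    h = n // 2
    top, bottom = c[:h], c[h:]
    W11 = [row[:h] for row in W[:h]]
    W12 = [row[h:] for row in W[:h]]
    W22 = [row[h:] for row in W[h:]]
    # Gamma_A: e = W12 * bottom over GF(2), then controlled-Z between e_j and top_j.
    e = [sum(W12[j][k] * bottom[k] for k in range(n - h)) % 2 for j in range(h)]
    cross = sum(ej & tj for ej, tj in zip(e, top)) % 2
    # Gamma_B and Gamma_C: two instances of the original problem of about half the size.
    return (2 * cross + exponent_recursive(W11, top) + exponent_recursive(W22, bottom)) % 4


if __name__ == "__main__":
    rng = random.Random(0)
    n = 8
    x = [rng.randint(0, 1) for _ in range(2 * n - 1)]
    W = [[x[j + k] for k in range(n)] for j in range(n)]      # symmetric Hankel matrix
    print(all(exponent_direct(W, c) == exponent_recursive(W, c)
              for c in itertools.product([0, 1], repeat=n)))
```

The factor 2 in front of `cross` reflects that the off-diagonal blocks appear twice in $c^T W c$, which is why the $\Gamma_A$ phase is a sign and only needs mod 2 arithmetic.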


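6.3 Pauli mixing from $\Delta_2(\mathrm{GF}(2^n))$ and $\nabla_2(\mathrm{GF}(2^n))$ in different bases

Here, we show how to achieve Pauli mixing by implementing $U_M$ for $M \in \Delta_2(\mathrm{GF}(2^n))$ and $\tilde{U}_M$ for $M \in \nabla_2(\mathrm{GF}(2^n))$. We will explain our approach in two parts. In the first part, we explain the actual generation and construction of the ensemble of unitaries, which is simple, but the resulting ensemble no longer corresponds to $\mathrm{SL}_2(\mathrm{GF}(2^n))$, so it is not clear that the ensemble is a unitary 2-design. In the second part, we prove that the new ensemble is Pauli mixing, so it is indeed a unitary 2-design.

The construction is based on the following decomposition of elements of $\mathrm{SL}_2(\mathrm{GF}(2^n))$, along the lines of Eq. (18):

\[
\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}
=
\begin{cases}
\begin{pmatrix} 1 & 0 \\ \alpha^{-1}\gamma & 1 \end{pmatrix}
\begin{pmatrix} \alpha & \beta \\ 0 & \alpha^{-1} \end{pmatrix}
& \text{if } \alpha \ne 0,\\[2ex]
\begin{pmatrix} \beta & 0 \\ \delta & \gamma \end{pmatrix}
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
& \text{if } \alpha = 0.
\end{cases}
\tag{71}
\]

Note that all matrices in this decomposition are lower triangular, upper triangular, or $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$. Lower triangular matrices can be implemented in the primal basis; upper triangular matrices can be implemented in the dual basis; and $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ can be implemented in any self-dual basis (by $H^{\otimes n}$).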
The procedure to generate an element of the ensemble is as follows.

Generation procedure:

1. Sample $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \in \mathrm{SL}_2(\mathrm{GF}(2^n))$ according to the uniform distribution.

2.1 If $\alpha \ne 0$ then

set $M_1$ to $\begin{pmatrix} \alpha & \beta \\ 0 & \alpha^{-1} \end{pmatrix}$

set $M_2$ to $\begin{pmatrix} 1 & 0 \\ \alpha^{-1}\gamma & 1 \end{pmatrix}$

construct the Clifford group element $U_{M_2} \circ \tilde{U}_{M_1}$ (composition of two circuits).

2.2 Else if $\alpha = 0$ then

set $M$ to $\begin{pmatrix} \beta & 0 \\ \delta & \gamma \end{pmatrix}$

construct the Clifford group element $U_M \circ H^{\otimes n}$ (composition of two circuits).

Note that the composition in step 2.1 is along the lines of the first case of Eq. (71) and the composition in step 2.2 is along the lines of the second case of Eq. (71). In each case, a Clifford group element with gate complexity $O(n \log n \log\log n)$ (or Clifford-gate complexity $O(n \log^2 n \log\log n)$) results; however, the subset of all Cliffords that can arise by this procedure does not have the structure of $\mathrm{SL}_2(\mathrm{GF}(2^n))$ because of the disparate coordinate systems being used for the components. This concludes the description of the generation and construction of elements of the ensemble.

We now explain why the ensemble resulting from the above procedure is a unitary 2-design in spite of the mismatched bases used to convert the matrices arising from Eq. (71) into Clifford unitaries. First, we consider the mixing property over the Paulis that results from $U_M$ for a random $M \in \Delta_2(\mathrm{GF}(2^n))$, and similarly for $\nabla_2(\mathrm{GF}(2^n))$. Partition the non-zero elements of $\mathrm{GF}(2^n) \times \mathrm{GF}(2^n)$ into these two (disjoint) subsets:

\begin{align}
R_1 &= \left\{ \tbinom{a}{b} \in \mathrm{GF}(2^n) \times \mathrm{GF}(2^n) : a = 0 \text{ and } b \ne 0 \right\}, \tag{72}\\
R_2 &= \left\{ \tbinom{a}{b} \in \mathrm{GF}(2^n) \times \mathrm{GF}(2^n) : a \ne 0 \right\}. \tag{73}
\end{align}

It is straightforward to verify that a random element $M \in \Delta_2(\mathrm{GF}(2^n))$ uniformly mixes within $R_1$ and it uniformly mixes within $R_2$ in the following sense.

Lemma 7. Let $M \in \Delta_2(\mathrm{GF}(2^n))$ be chosen uniformly at random. Then, for any $\binom{a}{b} \in R_1$, the distribution of $M\binom{a}{b}$ is uniform over $R_1$ and, for any $\binom{a}{b} \in R_2$, the distribution of $M\binom{a}{b}$ is uniform over $R_2$.

A similar result holds for $\nabla_2(\mathrm{GF}(2^n))$ with $a$ and $b$ switched in the definitions of $R_1$ and $R_2$ (we omit the simple proof of this).
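Lemma 7 can be confirmed by exhaustive enumeration in a small field. The sketch below assumes $n = 3$ with $\mathrm{GF}(2^3)$ represented modulo $x^3 + x + 1$, lists every lower triangular matrix $\begin{pmatrix} p & 0 \\ q & p^{-1} \end{pmatrix}$ with $p \ne 0$, and tallies where a fixed starting vector is sent. The field representation and the helper names are illustrative assumptions.

```python
from collections import Counter

N_BITS = 3
MODULUS = 0b1011  # x^3 + x + 1, irreducible over GF(2) (assumed representation)
SIZE = 2 ** N_BITS


def gf_mul(a, b):
    """Carry-less multiplication reduced modulo MODULUS."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> N_BITS:
            a ^= MODULUS
    return result


def gf_inv(a):
    """Inverse of a non-zero element, found by brute force."""
    return next(x for x in range(1, SIZE) if gf_mul(a, x) == 1)


def lower_triangular_group():
    """All elements ((p, 0), (q, p^-1)) of Delta_2(GF(2^n)) with p non-zero."""
    return [((p, 0), (q, gf_inv(p))) for p in range(1, SIZE) for q in range(SIZE)]


def act(M, v):
    """Apply the 2x2 matrix M to the column vector v = (a, b); addition is XOR."""
    a, b = v
    return (gf_mul(M[0][0], a) ^ gf_mul(M[0][1], b),
            gf_mul(M[1][0], a) ^ gf_mul(M[1][1], b))


def image_distribution(v):
    """How often each vector is hit as M ranges over the whole group."""
    return Counter(act(M, v) for M in lower_triangular_group())


if __name__ == "__main__":
    in_r1 = image_distribution((0, 3))
    in_r2 = image_distribution((5, 2))
    print(len(in_r1), set(in_r1.values()))   # support size and multiplicities in R_1
    print(len(in_r2), set(in_r2.values()))   # support size and multiplicities in R_2
```

Equal multiplicities over the whole of $R_1$ (respectively $R_2$) are exactly the uniformity asserted by the lemma.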

To illustrate the consequences of Lemma 7 on the Paulis, we can organize the $n$-qubit Paulis into rows and columns where $X^{\lceil a \rceil} Z^{\lfloor b \rfloor}$ is in column $a$ and row $b$. We choose the first row and column to be labeled by $b = 0$ and $a = 0$ and call them the zero row and zero column. The relative ordering of the remaining rows and columns does not affect our discussion; they are collectively called the nonzero rows and the nonzero columns. Figure 6 shows such a layout for the $n = 2$ case where the identity Pauli is excluded.

Based on Lemma 7, conjugating by $U_M$ for a uniformly distributed $M \in \Delta_2(\mathrm{GF}(2^n))$ causes the zero column to mix uniformly and also the complement of the zero column (consisting of all the nonzero columns) to mix uniformly. We call this effect lower-triangular Pauli mixing. Schematically, this is illustrated in Figure 7. We can similarly define upper-triangular Pauli mixing, corresponding to a transposed version of Figure 7. Sampling $M \in \nabla_2(\mathrm{GF}(2^n))$ and then constructing the Clifford unitary $\tilde{U}_M$ achieves upper-triangular mixing.

Figure 6: A natural arrangement of all the non-trivial 2-qubit Paulis into rows and columns. Pauli mixing requires a uniform distribution on the 15 items.

Figure 7: Illustration of lower-triangular Pauli mixing. Top: mixing effect within the zero column. Bottom: mixing effect within the complement of the zero column ($N = 2^n$).

We define one additional form of mixing, that we call column Pauli mixing, illustrated in Figure 8, where Paulis in the zero column do not change and any Pauli in a nonzero column mixes within its column. Such mixing is accomplished by choosing $M = \begin{pmatrix} 1 & 0 \\ \beta & 1 \end{pmatrix}$ for a uniformly random $\beta \in \mathrm{GF}(2^n)$, and then constructing the Clifford unitary $U_M$.

Figure 8: Illustration of column mixing. Top: elements in the zero column stay put. Bottom: elements in any nonzero column uniformly mix within the column ($N = 2^n$).

From Eq. (71), we can deduce that our procedure is applying a probabilistic mixture of the two procedures below. With probability $\frac{2^n}{2^n+1}$, it applies Procedure A; with probability $\frac{1}{2^n+1}$ it applies Procedure B ($\frac{1}{2^n+1}$ is the probability that $\alpha = 0$ for a random $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \in \mathrm{SL}_2(\mathrm{GF}(2^n))$).

Procedure A:

1. Apply an upper-triangular Pauli mixing operation.

2. Apply a column Pauli mixing operation (independently from the first step).

Procedure B:

1. Apply $H^{\otimes n}$ (thereby transposing the layout of the Paulis).

2. Apply a lower-triangular mixing operation.

We now prove that the above mixture of Procedures A and B results in Pauli mixing.

Lemma 8. The stochastic process of applying either Procedure A or Procedure B, with probabilities $\frac{2^n}{2^n+1}$ and $\frac{1}{2^n+1}$ (respectively), is Pauli mixing.


Proof. For convenience, let $N = 2^n$. First, consider an initial Pauli in the zero row (i.e., $b = 0$, 
so it is a pure $X$-type Pauli indexed by some $a \neq 0$). Then, as illustrated in Figure 9, if Procedure A is 
applied, the result is a uniform distribution over all nonzero columns, where the probability of each 
Pauli is $\frac{1}{N(N-1)}$. On the other hand, if Procedure B is applied, the result is a uniform distribution 
on the zero column, where the probability of each Pauli is $\frac{1}{N-1}$. Consider the mixture of these 
distributions (Procedure A with probability $\frac{N}{N+1}$ and Procedure B with probability $\frac{1}{N+1}$). Since 
$\frac{N}{N+1} \cdot \frac{1}{N(N-1)} = \frac{1}{N^2-1}$ and $\frac{1}{N+1} \cdot \frac{1}{N-1} = \frac{1}{N^2-1}$, the result is the uniform distribution. 

Figure 9: Illustration of mixing procedure starting in the zero row ($N = 2^n$). Top: Procedure A. 
Bottom: Procedure B. 

Next, consider the case of an initial Pauli that is not in the zero row (i.e., with $b \neq 0$). 
Then, as illustrated in Figure 10, if Procedure A is applied, the result is a two-level distribution: the 
probability of each Pauli in the zero column is $\frac{1}{N(N-1)}$; the probability of each Pauli in any nonzero 
column is $\frac{1}{N^2}$. On the other hand, if Procedure B is applied, the result is a uniform distribution over 
the nonzero columns, where the probability of each Pauli is $\frac{1}{N(N-1)}$. Consider the mixture of these 
distributions (Procedure A with probability $\frac{N}{N+1}$ and Procedure B with probability $\frac{1}{N+1}$). Since 
$\frac{N}{N+1} \cdot \frac{1}{N(N-1)} = \frac{1}{N^2-1}$ and $\frac{N}{N+1} \cdot \frac{1}{N^2} + \frac{1}{N+1} \cdot \frac{1}{N(N-1)} = \frac{1}{N^2-1}$, the result is the uniform distribution. □ 

Figure 10: Illustration of mixing procedure starting in a nonzero row ($N = 2^n$). Top: 
Procedure A. Bottom: Procedure B. 
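
The arithmetic in the proof of Lemma 8 can be checked exactly with rational numbers. The sketch below assumes the per-Pauli probabilities quoted in the proof for each procedure and each starting row; the function name `mixture_probabilities` and the dictionary keys are illustrative and not part of the paper.

```python
from fractions import Fraction

def mixture_probabilities(n):
    """Per-Pauli probabilities after mixing Procedures A and B, for N = 2^n."""
    N = 2 ** n
    p_a = Fraction(N, N + 1)      # weight of Procedure A
    p_b = Fraction(1, N + 1)      # weight of Procedure B
    # Start in the zero row: A spreads over the nonzero columns only,
    # B spreads over the zero column only.
    zero_row = {
        "zero column": p_b * Fraction(1, N - 1),
        "nonzero columns": p_a * Fraction(1, N * (N - 1)),
    }
    # Start in a nonzero row: A gives a two-level distribution,
    # B spreads over the nonzero columns only.
    nonzero_row = {
        "zero column": p_a * Fraction(1, N * (N - 1)),
        "nonzero columns": p_a * Fraction(1, N * N) + p_b * Fraction(1, N * (N - 1)),
    }
    return zero_row, nonzero_row

for n in (1, 2, 3):
    print(n, mixture_probabilities(n))
```

In every case the value equals $1/(N^2-1)$, the uniform probability over the $N^2 - 1$ non-identity Paulis.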
A Proof of Lemma 1 and Corollary 1 

Lemma 1. Let $\mathcal{E}$ be any ensemble of unitaries in $\mathcal{U}_N$. Then, the following are equivalent: 

(1) $\mathcal{E}$ is degree-2 expectation preserving. 

(2) $\mathcal{E}$ is 2-query indistinguishable. 

(3) $\mathcal{E}$ implements the full bilateral twirl. 

(4) $\mathcal{E}$ implements the full channel twirl. 

Corollary 1. For $\mathcal{E} = \{p_i, U_i\}_{i=1}^{k}$, let $\mathcal{E}^\dagger := \{p_i, U_i^\dagger\}_{i=1}^{k}$. Then: 

(a) $\mathcal{E}$ implements the full bilateral twirl if and only if $\mathcal{E}^\dagger$ does. 

(b) $\mathcal{E}$ implements the full channel twirl if and only if $\mathcal{E}^\dagger$ does. 

Proof. We will show, in order, (1) $\Rightarrow$ (2) $\Rightarrow$ (3) $\Rightarrow$ (1), Corollary 1(a), then (2) $\Rightarrow$ (4) $\Rightarrow$ (3), and 
finally Corollary 1(b). 

(1) $\Rightarrow$ (2): Consider any distinguishing circuit $C$ making up to two queries of $U$ or $U^\dagger$. Note that 
the output state $\eta_2(C, U)$ is a product of matrices with at most two factors of $U$ and two factors 
of $U^\dagger$. Thus, each entry of $\eta_2(C, U)$ is a polynomial of degree at most 2 in the matrix elements of 
$U$ and at most 2 in the complex conjugates of those matrix elements. By hypothesis, $\mathcal{E}$ is degree-2 
expectation preserving, thus the following holds entrywise: 

$$\sum_{i=1}^{k} p_i\, \eta_2(C, U_i) = \int d\mu(U)\, \eta_2(C, U), \tag{74}$$

and $\mathcal{E}$ is 2-query indistinguishable. 

(2) $\Rightarrow$ (3): This follows from the definition, since the bilateral twirl circuit is a special case of a 
2-query distinguishing circuit $C$. 

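(3) $\Rightarrow$ (1): Let $\{|j\rangle\}_{j=1}^{N}$ be a basis for $\mathbb{C}^N$. Suppose $\mathcal{E}$ implements the full bilateral twirl, so, $\forall \rho$, 

$$\sum_{i=1}^{k} p_i\, (U_i \otimes U_i)\, \rho\, (U_i^\dagger \otimes U_i^\dagger) = \int d\mu(U)\, (U \otimes U)\, \rho\, (U^\dagger \otimes U^\dagger). \tag{75}$$

Since the density matrices span the complex vector space of all square matrices of the 
same dimension, the above relation holds if we replace $\rho$ by $|a_1\rangle\langle a_3| \otimes |a_2\rangle\langle a_4|$, for all $a_1, a_2, a_3, a_4 \in 
\{1, \dots, N\}$. Furthermore, we can left- and right-multiply the above equation by $\langle a_5| \otimes \langle a_6|$ and 
$|a_7\rangle \otimes |a_8\rangle$. This gives 

$$\sum_{i=1}^{k} p_i\, \langle a_5|U_i|a_1\rangle \langle a_6|U_i|a_2\rangle \langle a_3|U_i^\dagger|a_7\rangle \langle a_4|U_i^\dagger|a_8\rangle = \int d\mu(U)\, \langle a_5|U|a_1\rangle \langle a_6|U|a_2\rangle \langle a_3|U^\dagger|a_7\rangle \langle a_4|U^\dagger|a_8\rangle.$$

Repeating the above for all possible $a_1, \dots, a_8$ and applying linearity implies Eq. [?] and that $\mathcal{E}$ is 
degree-2 expectation preserving. 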

Corollary 1(a): From Definition 2, $\mathcal{E}$ is 2-query indistinguishable iff $\mathcal{E}^\dagger$ is. Thus, by the equivalence 
between (2) and (3), $\mathcal{E}$ implements the full bilateral twirl if and only if $\mathcal{E}^\dagger$ does. 

(2) $\Rightarrow$ (4): This follows from the definition, since the channel twirl circuit is a special case of a 
2-query distinguishing circuit $C$. 

(4) $\Rightarrow$ (3): We provide a proof for the most general unitary 2-design here. Readers interested in 
the special (but common) case when the ensemble $\mathcal{E}$ consists only of Clifford unitaries and $N = 2^n$ 
can consult Appendix B for a short proof. 

We begin with some relevant concepts in quantum information. Let $|1\rangle, \dots, |N\rangle$ denote an 
orthonormal basis for $\mathbb{C}^N$, $B(\mathbb{C}^N)$ denote the set of all bounded $N \times N$ matrices, and $\Phi = 
\sum_{l,j=1}^{N} |l\rangle\langle j| \otimes |l\rangle\langle j|$. Let $\mathcal{I}$ denote the identity map on $B(\mathbb{C}^N)$. For any linear map $\Theta : B(\mathbb{C}^N) \to 
B(\mathbb{C}^N)$, denote the Choi matrix of $\Theta$ by $J(\Theta) = (\Theta \otimes \mathcal{I})(\Phi) = \sum_{l,j=1}^{N} \Theta(|l\rangle\langle j|) \otimes |l\rangle\langle j|$ [?]. $\Theta$ is 
completely positive if and only if $J(\Theta)$ is positive semidefinite [9] (see also [26, 39]). A quantum 
channel is a linear, trace-preserving, and completely positive map. 

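Suppose for every quantum channel $\Lambda$, $\mathbb{E}_{\mathcal{E}}(\Lambda) = \mathbb{E}_{\mu}(\Lambda)$. Then, $J(\mathbb{E}_{\mathcal{E}}(\Lambda)) = J(\mathbb{E}_{\mu}(\Lambda))$. Rephrasing 
this equality using Eqs. [?] and (6), we have 

$$\sum_{i=1}^{k} p_i\, (U_i^\dagger \otimes I)\, (\Lambda \otimes \mathcal{I})\big((U_i \otimes I)\, \Phi\, (U_i^\dagger \otimes I)\big)\, (U_i \otimes I) = \int d\mu(U)\, (U^\dagger \otimes I)\, (\Lambda \otimes \mathcal{I})\big((U \otimes I)\, \Phi\, (U^\dagger \otimes I)\big)\, (U \otimes I). \tag{76}$$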

We transform each side of the above equation in 3 steps, turning the Choi matrix of the twirled 
channel into the bilateral twirl of an operator closely related to the Choi matrix of $\Lambda$. First, for 
the LHS of Eq. (76), we apply the transpose trick $(U_i \otimes I)\, \Phi\, (U_i^\dagger \otimes I) = (I \otimes U_i^T)\, \Phi\, (I \otimes U_i^*)$, where 
$T$ and $*$ denote the transpose and the complex conjugate respectively. Second, we commute the 
conjugation by $(I \otimes U_i^T)$ with $\Lambda \otimes \mathcal{I}$. We apply similar manipulations on the RHS of Eq. (76). The 
equation becomes 

$$\sum_{i=1}^{k} p_i\, (U_i^\dagger \otimes U_i^T)\, (\Lambda \otimes \mathcal{I})(\Phi)\, (U_i \otimes U_i^*) = \int d\mu(U)\, (U^\dagger \otimes U^T)\, (\Lambda \otimes \mathcal{I})(\Phi)\, (U \otimes U^*). \tag{77}$$

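Third, we apply to Eq. (77) the partial transpose of the second system: for any $A_1, A_2 \in B(\mathbb{C}^N)$, this 
linear map takes $A_1 \otimes A_2$ to $A_1 \otimes A_2^T$. In particular, the partial transpose of $(I \otimes U_i^T)\, \Phi\, (I \otimes U_i^*) = 
\sum_{l,j=1}^{N} |l\rangle\langle j| \otimes (U_i^T |l\rangle\langle j| U_i^*)$ is equal to $\sum_{l,j=1}^{N} |l\rangle\langle j| \otimes (U_i^\dagger |j\rangle\langle l| U_i) = (I \otimes U_i^\dagger)\, \chi\, (I \otimes U_i)$, where 
$\chi = \sum_{l,j=1}^{N} |l\rangle\langle j| \otimes |j\rangle\langle l|$ is the swap operator on $\mathbb{C}^N \otimes \mathbb{C}^N$. Eq. (77) becomes 

$$\sum_{i=1}^{k} p_i\, (U_i^\dagger \otimes U_i^\dagger)\, (\Lambda \otimes \mathcal{I})(\chi)\, (U_i \otimes U_i) = \int d\mu(U)\, (U^\dagger \otimes U^\dagger)\, (\Lambda \otimes \mathcal{I})(\chi)\, (U \otimes U), \tag{78}$$

which is equivalent to 

$$\mathcal{T}_{\mathcal{E}^\dagger}\big((\Lambda \otimes \mathcal{I})(\chi)\big) = \mathcal{T}_{\mu}\big((\Lambda \otimes \mathcal{I})(\chi)\big). \tag{79}$$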

(In the above, we have used the fact $d\mu(U^\dagger) = d\mu(U)$.) Altogether, the transpose trick, the 
commutation, and the partial transpose transform Eq. (76), concerning the equality of the Choi 
matrices of the two channel twirls for $\Lambda$, into Eq. (79), establishing the equality of the two bilateral 
twirls of the matrix $(\Lambda \otimes \mathcal{I})(\chi)$. 
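
The two linear-algebra facts used in these steps, the transpose trick and the effect of the partial transpose on $\Phi$, can be checked numerically on a small example. The sketch below uses plain Python lists as matrices; the helper names and the particular $2 \times 2$ unitary are illustrative assumptions, not notation from the paper.

```python
import math

def matmul(A, B):
    """Matrix product of two list-of-lists matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def kron(A, B):
    """Kronecker (tensor) product of A and B."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def transpose(A):
    return [list(col) for col in zip(*A)]

def conj(A):
    return [[complex(x).conjugate() for x in row] for row in A]

def dagger(A):
    return transpose(conj(A))

def identity(N):
    return [[1 if i == j else 0 for j in range(N)] for i in range(N)]

def close(A, B, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))

def phi(N):
    """Phi = sum_{l,j} |l><j| (x) |l><j| (unnormalized)."""
    M = [[0] * (N * N) for _ in range(N * N)]
    for l in range(N):
        for j in range(N):
            M[l * N + l][j * N + j] = 1
    return M

def swap(N):
    """chi = sum_{l,j} |l><j| (x) |j><l|."""
    M = [[0] * (N * N) for _ in range(N * N)]
    for l in range(N):
        for j in range(N):
            M[l * N + j][j * N + l] = 1
    return M

def partial_transpose_second(M, N):
    """Transpose the second tensor factor of an N^2 x N^2 matrix."""
    return [[M[i * N + l][j * N + k] for j in range(N) for l in range(N)]
            for i in range(N) for k in range(N)]

def check_identities(U):
    """Return (transpose trick holds, partial transpose yields twisted swap)."""
    N = len(U)
    I = identity(N)
    lhs = matmul(matmul(kron(U, I), phi(N)), kron(dagger(U), I))
    rhs = matmul(matmul(kron(I, transpose(U)), phi(N)), kron(I, conj(U)))
    twisted_swap = matmul(matmul(kron(I, dagger(U)), swap(N)), kron(I, U))
    return close(lhs, rhs), close(partial_transpose_second(rhs, N), twisted_swap)

s = 1 / math.sqrt(2)
U_EXAMPLE = [[s, s], [1j * s, -1j * s]]   # a non-symmetric 2x2 unitary
print(check_identities(U_EXAMPLE))
```

A non-symmetric unitary is chosen on purpose, so that $U^T \neq U$ and the check does not pass trivially.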

It remains to apply Eq. (79) to a set of carefully chosen $\Lambda$'s to show that $\mathcal{T}_{\mathcal{E}^\dagger}(A) = \mathcal{T}_{\mu}(A)$ for 
a basis $\{A\}$ of the input space. This will show that $\mathcal{E}^\dagger$ implements the full bilateral twirl. By 
Corollary 1(a), $\mathcal{E}$ also implements the full bilateral twirl and the proof will be completed. 

We consider $\Lambda$'s with a specific form. Let $\mathcal{R}$ be the completely randomizing map on $B(\mathbb{C}^N)$, 
i.e., $\mathcal{R}(\rho) = (\mathrm{Tr}\,\rho)\, I/N$ for all $\rho \in B(\mathbb{C}^N)$. Note that $J(\mathcal{R}) = (I \otimes I)/N$. Consider any bounded 
linear map $\tilde{\Lambda}$ that is trace preserving and for which $J(\tilde{\Lambda})$ is Hermitian (the latter property is called 
hermiticity preserving). Then, for sufficiently small, positive $\lambda$, $\Lambda = (1 - \lambda)\mathcal{R} + \lambda\tilde{\Lambda}$ has a positive 
semidefinite Choi matrix (because the Choi matrix of $\mathcal{R}$ is proportional to the identity), and is 
therefore completely positive. Furthermore, $\Lambda$ is linear and trace-preserving. So, $\Lambda$ is a quantum 
channel. When we apply Eq. (79) to such $\Lambda$'s, the $\mathcal{R}$ terms cancel out (because $(\mathcal{R} \otimes \mathcal{I})(\chi) = 
(I \otimes I)/N$, which is invariant under either bilateral twirl). Therefore, Eq. (79) holds for all linear, 
trace- and hermiticity-preserving maps $\tilde{\Lambda}$ (which are easier to construct than quantum channels). 
We are ready to show that $\mathcal{T}_{\mathcal{E}^\dagger}(A) = \mathcal{T}_{\mu}(A)$ for a basis $\{A\}$ of the input space. We take 
$A = H_l \otimes H_j$, where $\{H_l\}_{l=1}^{d^2}$ (with $d = N$) is a basis for $B(\mathbb{C}^N)$ with the following additional properties: 

(1) Each $H_l$ is Hermitian. 

(2) $H_1 = I/\sqrt{N}$. 

(3) $\mathrm{Tr}(H_l H_j) = \delta_{lj}$. In particular, $H_l$ is traceless for $l > 1$. 

(4) The swap operator has a simple representation in this basis, 

$$\chi = \sum_{l=1}^{d^2} H_l \otimes H_l. \tag{80}$$

Such a basis exists for all $N$. When $N = 2^n$, $H_l$ can be taken to be proportional to the Pauli matrices 
(see Eq. (81) for the last condition). For general $N$, we show in Appendix C that the generalized 
Gell-Mann matrices can be used to construct such $H_l$'s. 


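We will verify that $\mathcal{T}_{\mathcal{E}^\dagger}(H_l \otimes H_j) = \mathcal{T}_{\mu}(H_l \otimes H_j)$ for all $1 \le l, j \le d^2$ by considering four cases. 
First, the equality is immediate for $l = j = 1$. Second, for each $1 < j \le d^2$, consider $\Lambda_{1j}$ defined 
by $\Lambda_{1j}(H_1) = H_1 + H_j$, and $\Lambda_{1j}(H_l) = 0$ for all $l \neq 1$. $\Lambda_{1j}$ is trace-preserving since each $H_l$ is 
traceless for $l > 1$. Furthermore, $(\Lambda_{1j} \otimes \mathcal{I})(\chi) = (H_1 + H_j) \otimes H_1$, and partial transposing the second 
system gives $J(\Lambda_{1j})$, which implies that $J(\Lambda_{1j})$ is Hermitian. Therefore, we can apply Eq. (79) to $\Lambda_{1j}$ and 
conclude $\mathcal{T}_{\mathcal{E}^\dagger}(H_j \otimes H_1) = \mathcal{T}_{\mu}(H_j \otimes H_1)$. Third, because of the symmetry of the bilateral twirl, 
$\mathcal{T}_{\mathcal{E}^\dagger}(H_1 \otimes H_j) = \mathcal{T}_{\mu}(H_1 \otimes H_j)$. Fourth, let $1 < j, l \le d^2$ and consider $\Lambda_{jl}$ such that $\Lambda_{jl}(H_1) = H_1$, 
$\Lambda_{jl}(H_j) = H_l$, and $\Lambda_{jl}(H_{j'}) = 0$ for all $j' \neq 1$ and $j' \neq j$. With arguments similar to the second 
case, $\mathcal{T}_{\mathcal{E}^\dagger}(H_l \otimes H_j) = \mathcal{T}_{\mu}(H_l \otimes H_j)$. This completes the proof. 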


Corollary 1(b): We have established the equivalence between (3) and (4), thus, by Corollary 1(a), 
$\mathcal{E}$ implements the full channel twirl if and only if $\mathcal{E}^\dagger$ does. 

□ 


B Short proof for (4) $\Rightarrow$ (3) in Lemma 1 

Here, we consider the special case when $\mathcal{E} = \{p_i, U_i\}$ is an ensemble of Clifford unitaries and 
$N = 2^n$. We will show that if $\mathcal{E}$ implements the full channel twirl then it implements the full 
bilateral twirl. 

The proof relies on several definitions in Section 3. We will show that if $\mathcal{E}$ implements the full 
channel twirl then it is necessarily Pauli mixing, and the rest follows from Lemma 2. Consider an 
ensemble $\mathcal{E} = \{p_i, U_i\}$ with Clifford unitaries $U_i$ such that $\mathbb{E}_{\mathcal{E}}(\Lambda) = \mathbb{E}_{\mu}(\Lambda)$ for all quantum channels 
$\Lambda$. Take an arbitrary Pauli matrix $P \in Q_n$ with $P \neq I$ and an overall phase so that $P = P^\dagger$. 
Let $\Lambda(\rho) = P \rho P^\dagger$. On one hand, $\mathbb{E}_{\mathcal{E}}(\Lambda)(\rho) = \sum_{i=1}^{k} p_i\, (U_i^\dagger P U_i)\, \rho\, (U_i^\dagger P U_i)^\dagger$. On the other hand, 
$\mathbb{E}_{\mu}(\Lambda)(\rho) = (1 - \lambda)\rho + \frac{\lambda}{2^{2n}-1} \sum_{Q \in Q_n \setminus \{I\}} Q \rho Q^\dagger$ for some $0 \le \lambda \le 1$. Note that for each $i$, $U_i$ is in 
the Clifford group, so $U_i^\dagger P U_i$ is a Pauli matrix. Thus, we have two Kraus representations for the 
same twirled channel, both with Kraus operators in the quotient Pauli group $Q_n$, which is a basis 
for $2^n \times 2^n$ matrices over $\mathbb{C}$. Invoking Theorem 8.2 of [?] concerning the degrees of freedom over 
these Kraus operators, the $i$-th term of $\mathbb{E}_{\mathcal{E}}(\Lambda)$ contributes to the $Q$ term of $\mathbb{E}_{\mu}(\Lambda)$ if and only if 
$U_i^\dagger P U_i$ is equivalent to $Q$ in $Q_n$ (see Section [?]). Finally, each $Q \neq I$ appears with equal weight in 
$\mathbb{E}_{\mu}(\Lambda)(\rho)$, thus the distribution $\{p_i, U_i^\dagger P U_i\}$ is uniform over $Q_n \setminus \{I\}$. 

C Construction of the basis $\{H_l\}$ 

We want $\{H_l\}_{l=1}^{d^2}$ (with $d = N$) to be a basis for $B(\mathbb{C}^N)$ with the following additional properties: 

(1) Each $H_l$ is Hermitian. 

(2) $H_1 = I/\sqrt{N}$. 

(3) $\mathrm{Tr}(H_l H_j) = \delta_{lj}$. In particular, $H_l$ is traceless for $l > 1$. 

(4) The swap operator $\chi = \sum_{l=1}^{d^2} H_l \otimes H_l$. 

We use the generalized Gell-Mann matrices for the construction. Let $H_1 = I/\sqrt{N}$. For $l = 
2, \dots, N$, let $H_l = D_l/\sqrt{l(l-1)}$, where $D_l$ is a diagonal matrix with $(D_l)_{1,1} = \dots = (D_l)_{l-1,l-1} = 1$, 
$(D_l)_{l,l} = -(l-1)$, and $(D_l)_{j,j} = 0$ for $l+1 \le j \le d$. For $1 \le j_1 < j_2 \le d$, let $X_{j_1,j_2} = (|j_1\rangle\langle j_2| + 
|j_2\rangle\langle j_1|)/\sqrt{2}$ and $Y_{j_1,j_2} = i(-|j_1\rangle\langle j_2| + |j_2\rangle\langle j_1|)/\sqrt{2}$. Let $\{H_l\}_{l=N+1}^{d^2} = \{X_{j_1,j_2}, Y_{j_1,j_2}\}_{1 \le j_1 < j_2 \le d}$ 
with any ordering. Then, $\{H_l\}_{l=1}^{d^2}$ span $B(\mathbb{C}^N)$, each $H_l$ is Hermitian, and $\mathrm{Tr}(H_l H_j) = \delta_{lj}$. Finally, 
the expression for the swap operator $\chi$ can be verified by checking that each of the $d^4$ matrix 
entries on the RHS has the value given by the LHS. The verification involves routine arithmetic: 
each off-diagonal element involves only 2 terms, and the diagonal elements can be expressed as 
simple telescopic sums. 
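
The construction above is easy to test numerically for small dimensions. The following is a minimal sketch, assuming the normalizations stated in this appendix; the function names (`gell_mann_basis`, `is_orthonormal`, `matches_swap`) and the list-of-lists matrix representation are illustrative choices.

```python
import math

def kron(A, B):
    """Kronecker (tensor) product of A and B."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def trace_of_product(A, B):
    return sum(A[i][k] * B[k][i] for i in range(len(A)) for k in range(len(A)))

def gell_mann_basis(N):
    """Return H_1, ..., H_{N^2}: normalized identity, diagonal, X-type, Y-type."""
    basis = [[[1 / math.sqrt(N) if i == j else 0 for j in range(N)] for i in range(N)]]
    for l in range(2, N + 1):
        norm = math.sqrt(l * (l - 1))
        diag = [1] * (l - 1) + [-(l - 1)] + [0] * (N - l)
        basis.append([[diag[i] / norm if i == j else 0 for j in range(N)]
                      for i in range(N)])
    s = 1 / math.sqrt(2)
    for j1 in range(N):
        for j2 in range(j1 + 1, N):
            X = [[0] * N for _ in range(N)]
            X[j1][j2] = X[j2][j1] = s
            Y = [[0] * N for _ in range(N)]
            Y[j1][j2] = -1j * s
            Y[j2][j1] = 1j * s
            basis += [X, Y]
    return basis

def is_orthonormal(basis, tol=1e-12):
    """Check Tr(H_l H_j) = delta_{lj} for every pair."""
    return all(abs(trace_of_product(A, B) - (1 if i == j else 0)) < tol
               for i, A in enumerate(basis) for j, B in enumerate(basis))

def swap(N):
    """chi = sum_{l,j} |l><j| (x) |j><l|."""
    M = [[0] * (N * N) for _ in range(N * N)]
    for l in range(N):
        for j in range(N):
            M[l * N + j][j * N + l] = 1
    return M

def matches_swap(N, tol=1e-12):
    """Check that sum_l H_l (x) H_l equals the swap operator."""
    dim = N * N
    total = [[0] * dim for _ in range(dim)]
    for H in gell_mann_basis(N):
        HH = kron(H, H)
        for r in range(dim):
            for c in range(dim):
                total[r][c] += HH[r][c]
    target = swap(N)
    return all(abs(total[r][c] - target[r][c]) < tol
               for r in range(dim) for c in range(dim))

print(is_orthonormal(gell_mann_basis(3)), matches_swap(3))
```

For $N = 2$ the basis reduces to the Pauli matrices divided by $\sqrt{2}$, which matches the remark that the Paulis can be used when $N = 2^n$.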


D Elementary proof that Pauli mixing implies a unitary 2-design 


Lemma 2. Let $\mathcal{E}$ be an ensemble of Clifford unitaries and $\mathcal{E}_Q$ be as defined in Section 3. If $\mathcal{E}$ is 
Pauli mixing, then $\mathcal{E}_Q$ implements the full bilateral twirl. 


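Proof. The goal is to show that $\mathcal{T}_{\mathcal{E}_Q}(\rho) = \mathcal{T}_{\mu}(\rho)$ for all density matrices $\rho$. Note that both $\mathcal{T}_{\mathcal{E}_Q}$ and 
$\mathcal{T}_{\mu}$ are linear transformations on $2^{2n} \times 2^{2n}$ matrices. Therefore, it suffices to show that $\mathcal{T}_{\mathcal{E}_Q}$ and $\mathcal{T}_{\mu}$ 
act identically on a basis for these matrices. We consider a basis that contains the identity matrix 
$I_{2n}$ and the swap operator $\chi_{2n}$ acting on $2n$ qubits, completed with matrices $M$ trace-orthogonal 
to $I_{2n}$ and $\chi_{2n}$ (i.e., $\mathrm{Tr}(I_{2n} M) = \mathrm{Tr}(\chi_{2n} M) = 0$). We will prove the following three claims: 

1. $\mathcal{T}_{\mu}(I_{2n}) = \mathcal{T}_{\mathcal{E}_Q}(I_{2n}) = I_{2n}$, 

2. $\mathcal{T}_{\mu}(\chi_{2n}) = \mathcal{T}_{\mathcal{E}_Q}(\chi_{2n}) = \chi_{2n}$, and 

3. if $\mathrm{Tr}(I_{2n} M) = \mathrm{Tr}(\chi_{2n} M) = 0$, then $\mathcal{T}_{\mu}(M) = \mathcal{T}_{\mathcal{E}_Q}(M) = 0$. 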


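Recall from Eqs. [?] and [?] that 

$$\mathcal{T}_{\mu}(\rho) = \int d\mu(U)\, (U \otimes U)\, \rho\, (U^\dagger \otimes U^\dagger) \quad \text{and} \quad \mathcal{T}_{\mathcal{E}_Q}(\rho) = \sum_{i,j} p_i\, 2^{-2n}\, (U_i R_j \otimes U_i R_j)\, \rho\, (R_j^\dagger U_i^\dagger \otimes R_j^\dagger U_i^\dagger).$$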


It follows that the first claim holds trivially. Furthermore, since $\chi_{2n} (A \otimes B) \chi_{2n} = B \otimes A$, or 
equivalently, $\chi_{2n} (A \otimes B) = (B \otimes A) \chi_{2n}$, the second claim follows. 

To prove the third claim, it suffices to show $\mathcal{T}_{\mathcal{E}_Q}(M) = 0$. This is because, for any $2^{2n} \times 2^{2n}$ 
matrix $M$, $\mathcal{T}_{\mu}(M) = \mathcal{T}_{\mu}(\mathcal{T}_{\mathcal{E}_Q}(M))$. In turn, this is due to the fact that, for every unitary $V$ on $n$ qubits 
and every $M$, $\mathcal{T}_{\mu}(M) = \mathcal{T}_{\mu}\big((V \otimes V) M (V^\dagger \otimes V^\dagger)\big)$; applying the last identity to each unitary in $\mathcal{E}_Q$ and invoking linearity gives 
the desired result. 

We now show that $\mathcal{T}_{\mathcal{E}_Q}(M) = 0$. We make the crucial observation that $\chi_2 = \frac{1}{2}(I \otimes I + X \otimes X + 
Y \otimes Y + Z \otimes Z)$, and thus 

$$\chi_{2n} = 2^{-n} \sum_{R_l \in Q_n} R_l \otimes R_l. \tag{81}$$

Now, we use the fact that $Q_n$ is a basis for $2^n \times 2^n$ matrices to write $M = \sum_{a,b} \alpha_{ab}\, R_a \otimes R_b$ for some 
$\alpha_{ab} \in \mathbb{C}$. We take $R_0 = I_n \in Q_n$, so the two conditions on $M$ can be rephrased as $\alpha_{00} = 0$ and 
$\sum_a \alpha_{aa} = 0$. By linearity, we focus on analyzing $\mathcal{T}_{\mathcal{E}_Q}(R_a \otimes R_b)$ for any $(a, b) \neq (0, 0)$. Note that 


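$$\mathcal{T}_{\mathcal{E}_Q}(R_a \otimes R_b) = \sum_i p_i\, (U_i \otimes U_i) \Big[ \sum_j 2^{-2n}\, (R_j \otimes R_j)(R_a \otimes R_b)(R_j^\dagger \otimes R_j^\dagger) \Big] (U_i^\dagger \otimes U_i^\dagger). \tag{82}$$

If $a \neq b$, $\exists c$ such that $R_c$ commutes with $R_a$ and anticommutes with $R_b$ (or vice versa). So, 

$$2 \sum_j (R_j \otimes R_j)(R_a \otimes R_b)(R_j^\dagger \otimes R_j^\dagger) = \sum_j (R_j \otimes R_j)(R_a \otimes R_b)(R_j^\dagger \otimes R_j^\dagger) + \sum_j (R_j R_c \otimes R_j R_c)(R_a \otimes R_b)(R_c^\dagger R_j^\dagger \otimes R_c^\dagger R_j^\dagger) = 0,$$

where the second sum is a relabeling of the first, while conjugating $R_a \otimes R_b$ by $R_c \otimes R_c$ flips its sign. 
Hence $\mathcal{T}_{\mathcal{E}_Q}(R_a \otimes R_b) = 0$. If $a = b$, 

$$\sum_j 2^{-2n}\, (R_j \otimes R_j)(R_a \otimes R_a)(R_j^\dagger \otimes R_j^\dagger) = R_a \otimes R_a. \tag{83}$$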


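Substituting the above into Eq. (82) and using the fact that $\mathcal{E}$ is Pauli mixing, we obtain, for $a \neq 0$, 

$$\mathcal{T}_{\mathcal{E}_Q}(R_a \otimes R_a) = \frac{1}{2^{2n}-1} \sum_{R_j \in Q_n \setminus \{I\}} R_j \otimes R_j =: T \tag{84}$$

for a matrix $T$ independent of $a$. Putting all the pieces together, 

$$\mathcal{T}_{\mathcal{E}_Q}(M) = \sum_{a,b} \alpha_{ab}\, \mathcal{T}_{\mathcal{E}_Q}(R_a \otimes R_b) = \sum_{a} \alpha_{aa}\, \mathcal{T}_{\mathcal{E}_Q}(R_a \otimes R_a) = \Big( \sum_{a} \alpha_{aa} \Big) T = 0. \tag{85}$$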


□ 
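
The two Pauli identities used in this proof, the swap decomposition behind Eq. (81) and the Pauli twirl of $R_a \otimes R_b$ in Eqs. (82) and (83), can be checked directly for a single qubit ($n = 1$). The sketch below is illustrative only: the indexing of the Paulis as `PAULIS[0..3]` and the helper names are assumptions made for this example.

```python
def matmul(A, B):
    """Matrix product of two list-of-lists matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def kron(A, B):
    """Kronecker (tensor) product of A and B."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def close(A, B, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
PAULIS = [I2, X, Y, Z]

SWAP = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]
ZERO = [[0] * 4 for _ in range(4)]

def swap_from_paulis():
    """Return (1/2) * sum_P P (x) P over the four single-qubit Paulis."""
    total = [[0] * 4 for _ in range(4)]
    for P in PAULIS:
        PP = kron(P, P)
        for r in range(4):
            for c in range(4):
                total[r][c] += PP[r][c] / 2
    return total

def pauli_twirl(a, b):
    """Return (1/4) * sum_j (R_j (x) R_j)(R_a (x) R_b)(R_j (x) R_j) for n = 1."""
    target = kron(PAULIS[a], PAULIS[b])
    total = [[0] * 4 for _ in range(4)]
    for R in PAULIS:
        RR = kron(R, R)              # Hermitian, so it is its own adjoint
        term = matmul(matmul(RR, target), RR)
        for r in range(4):
            for c in range(4):
                total[r][c] += term[r][c] / 4
    return total

print(close(swap_from_paulis(), SWAP))
print(all(close(pauli_twirl(a, b), kron(PAULIS[a], PAULIS[a]) if a == b else ZERO)
          for a in range(4) for b in range(4)))
```

The twirl annihilates every $R_a \otimes R_b$ with $a \neq b$ and fixes every $R_a \otimes R_a$, which is the mechanism that reduces $\mathcal{T}_{\mathcal{E}_Q}(M)$ to the diagonal coefficients $\alpha_{aa}$.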


E Lower bounds for size and depth of unitary 2-designs 

Let $\mathcal{E} = \{p_i, U_i\}_{i=1}^{k}$ be any exact unitary 2-design on $n$ qubits. We show that a high probability set 
of the unitaries have size $\Omega(n)$ and depth $\Omega(\log n)$, assuming a universal gate set consisting of 1- 
and 2-qubit gates. Both proofs invoke only Definition 3, and they apply to unitary 2-designs that 
approximate the exact operation under Definition 2 or 3 in the diamond norm. 

Suppose the circuit for $U_i$ acts nontrivially on $s_i$ qubits. We will show that $\sum_{i=1}^{k} p_i s_i \ge n/2$, 
so, on average the circuit size is at least $n/2$. Since $\mathcal{E}$ implements the full bilateral twirl, the 
quantum operation $\rho \to \sum_{i=1}^{k} p_i U_i \rho U_i^\dagger$ is the complete randomization map on $n$ qubits. For 
each $j$, consider the input $|0\rangle\langle 0|$ on the $j$th qubit. Since the output on the $j$th qubit is $|1\rangle$ with 
probability $\frac{1}{2}$, the $j$th qubit must be acted on nontrivially by the $U_i$'s with total probability at least $\frac{1}{2}$. 
Define a matrix with rows labeled by 
$i = 1, \dots, k$, and columns labeled by $j = 1, \dots, n$, whose $(i, j)$ entry is $p_i$ if $U_i$ acts nontrivially 
on qubit $j$ and $0$ otherwise. The above argument implies that each column sums to at least $1/2$. Also, by definition, 
the $i$th row sums to $s_i p_i$. The total of the row sums is equal to the total of the column sums, so, 
$\sum_{i=1}^{k} p_i s_i \ge n/2$, as claimed. Furthermore, consider the set $S = \{i : s_i < n/4\}$. If $\sum_{i \in S} p_i > 2/3$, then 
$\sum_{i=1}^{k} p_i s_i < n/2$, a contradiction; so, with probability at least $1/3$, the circuit has size at least $n/4$. 

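For the lower bound on the depth, consider the bilateral twirl $\mathcal{T}_{\mathcal{E}}$ applied to the matrix $Z \otimes 
I^{\otimes n-1} \otimes Z \otimes I^{\otimes n-1}$, 

$$\mathcal{T}_{\mathcal{E}}(Z \otimes I^{\otimes n-1} \otimes Z \otimes I^{\otimes n-1}) = \sum_{i=1}^{k} p_i\, \big(U_i (Z \otimes I^{\otimes n-1}) U_i^\dagger\big) \otimes \big(U_i (Z \otimes I^{\otimes n-1}) U_i^\dagger\big). \tag{86}$$

Express each $U_i (Z \otimes I^{\otimes n-1}) U_i^\dagger$ as a linear combination of Pauli matrices, and define the weight $t_i$ 
to be the number of qubits that are acted on nontrivially by at least one of the terms. Since each 
gate interacts with at most two qubits, if the depth of the circuit for $U_i$ is $d_i$, then $d_i \ge \log t_i$. We 
now show that most $U_i (Z \otimes I^{\otimes n-1}) U_i^\dagger$ have weight $t_i \ge n/2$. 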

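From Appendix D, 

$$\mathcal{T}_{\mathcal{E}}(Z \otimes I^{\otimes n-1} \otimes Z \otimes I^{\otimes n-1}) = \frac{1}{2^{2n}-1} \sum_{R_j \in Q_n \setminus \{I\}} R_j \otimes R_j. \tag{87}$$

The fraction of $R_j$'s with weight less than $n/2$ is at most 

$$4^{-n} \sum_{l=0}^{\lfloor n/2 \rfloor} \binom{n}{l} 3^{l} \le 4^{-n} \sum_{l=0}^{\lfloor n/2 \rfloor} \binom{n}{l} 3^{n/2} \le 4^{-n}\, 2^{n}\, 3^{n/2} \approx 0.866^{n}.$$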


Let $\mathcal{I} = \{i : t_i \ge n/2\}$. Then $\sum_{i \in \mathcal{I}} p_i \to 1$, but in particular $\sum_{i \in \mathcal{I}} p_i \ge 1/2$, because otherwise, the 
RHS of Eqs. (86) and (87) cannot be equal. 
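
The counting bound on low-weight Paulis can be compared with the exact fraction for moderate $n$. The sketch below counts the non-identity $n$-qubit Paulis of weight $l$ as $\binom{n}{l} 3^l$ and compares the exact fraction with $(\sqrt{3}/2)^n \approx 0.866^n$; the function names are illustrative.

```python
from math import comb, sqrt

def low_weight_fraction(n):
    """Fraction of the 4^n - 1 non-identity n-qubit Paulis with weight < n/2."""
    count = sum(comb(n, l) * 3 ** l for l in range(1, n + 1) if 2 * l < n)
    return count / (4 ** n - 1)

def upper_bound(n):
    """The bound 4^{-n} * 2^n * 3^{n/2} = (sqrt(3)/2)^n."""
    return (sqrt(3) / 2) ** n

for n in (4, 8, 16, 32):
    print(n, low_weight_fraction(n), upper_bound(n))
```

The exact fraction decays noticeably faster than the bound, which is consistent with the remark that the probability of the high-weight set tends to 1.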


F Pauli group permutations are uniquely induced 


Lemma 9. Suppose that unitaries $U$ and $V$ have the property that they induce the same permutation 
on the Pauli group so that, for all $a, b \in \{0,1\}^n$, 

$$U X^a Z^b U^\dagger \equiv V X^a Z^b V^\dagger, \tag{88}$$

where $\equiv$ means equal up to a global phase that can be a function of $a$ and $b$. Then $V = U X^c Z^d$ for 
some $c, d \in \{0,1\}^n$ (up to a global phase). (Here $a$ and $b$ are binary strings, as opposed to elements 
of $\mathrm{GF}(2^n)$, so we do not require the notation $\lceil a \rceil$ and $\lfloor b \rfloor$ that occurs in other sections.) 


Proof. Note that Eq. (88) is equivalent to 

$$X^a Z^b (U^\dagger V)(X^a Z^b)^\dagger = \lambda_{a,b}\, U^\dagger V \tag{89}$$

for all $a, b \in \{0,1\}^n$, where $\lambda_{a,b}$ is the global phase in Eq. (88). We can express $U^\dagger V$ as 

$$U^\dagger V = \sum_{c,d \in \{0,1\}^n} \alpha_{c,d}\, X^c Z^d. \tag{90}$$

Recall that $X^a Z^b$ and $X^c Z^d$ either commute or anticommute, depending on the value of the 
symplectic inner product$^6$ of $(a, b)$ and $(c, d)$ (they commute when $(a, b) \cdot (c, d) = 0$ and anticommute 
otherwise). Using this fact and substituting Eq. (90) into Eq. (89), we obtain 

$$\sum_{c,d \in \{0,1\}^n} (-1)^{(a,b) \cdot (c,d)}\, \alpha_{c,d}\, X^c Z^d = \sum_{c,d \in \{0,1\}^n} \lambda_{a,b}\, \alpha_{c,d}\, X^c Z^d. \tag{91}$$

Since the Paulis $X^c Z^d$ are linearly independent, the coefficients must match. 

We now show that at most one $\alpha_{c,d}$ can be nonzero. Suppose two are nonzero: $\alpha_{c_1,d_1} \neq 0 \neq \alpha_{c_2,d_2}$ 
for some $(c_1, d_1) \neq (c_2, d_2)$. Then there exists $(a, b)$ such that $(a, b) \cdot (c_1, d_1) \neq (a, b) \cdot (c_2, d_2)$. Then, 
from Eq. (91), we can deduce that 

$$(-1)^{(a,b) \cdot (c_1,d_1)} = \lambda_{a,b} = (-1)^{(a,b) \cdot (c_2,d_2)}, \tag{92}$$

which is a contradiction. Therefore there is a unique nonzero $\alpha_{c,d}$, which implies $V = \alpha_{c,d}\, U X^c Z^d$. 

□ 
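
The commutation rule that drives this proof, $X^a Z^b\, X^c Z^d = (-1)^{(a,b) \cdot (c,d)}\, X^c Z^d\, X^a Z^b$, can be verified exhaustively for small $n$ with integer matrices. The sketch below is illustrative; the helper names `pauli`, `symplectic`, and `commutation_rule_holds` are assumptions made for this example.

```python
from itertools import product

def matmul(A, B):
    """Matrix product of two list-of-lists matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def kron(A, B):
    """Kronecker (tensor) product of A and B."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

def pauli(a, b):
    """Return X^a Z^b for bit tuples a and b, as an integer matrix."""
    result = [[1]]
    for a_k, b_k in zip(a, b):
        factor = [[1, 0], [0, 1]]
        if a_k:
            factor = matmul(factor, X)
        if b_k:
            factor = matmul(factor, Z)
        result = kron(result, factor)
    return result

def symplectic(a, b, c, d):
    """Symplectic inner product (a, b) . (c, d) = a.d + b.c mod 2."""
    return (sum(x * y for x, y in zip(a, d)) + sum(x * y for x, y in zip(b, c))) % 2

def commutation_rule_holds(n):
    """Check P Q = (-1)^{(a,b).(c,d)} Q P for all n-qubit X^a Z^b and X^c Z^d."""
    strings = list(product((0, 1), repeat=n))
    for a, b, c, d in product(strings, repeat=4):
        P, Q = pauli(a, b), pauli(c, d)
        sign = -1 if symplectic(a, b, c, d) else 1
        if matmul(P, Q) != [[sign * x for x in row] for row in matmul(Q, P)]:
            return False
    return True

print(commutation_rule_holds(1), commutation_rule_holds(2))
```

Because $X$ and $Z$ have integer entries, the comparison is exact and needs no numerical tolerance.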


$^6$ The symplectic inner product is defined as $(a, b) \cdot (c, d) = \big(\bigoplus_{k=1}^{n} a_k d_k\big) \oplus \big(\bigoplus_{k=1}^{n} b_k c_k\big)$. 